DIAGNOSIS AND TREATMENT OF CANCERS WITH MicroRNA LOCATED IN OR NEAR CANCER ASSOCIATED CHROMOSOMAL FEATURES

ABSTRACT

MicroRNA genes are highly associated with chromosomal features involved in the etiology of different cancers. The perturbations in the genomic structure or chromosomal architecture of a cell caused by these cancer-associated chromosomal features can affect the expression of the miR gene(s) located in close proximity to that chromosomal feature. Evaluation of miR gene expression can therefore be used to indicate the presence of a cancer-causing chromosomal lesion in a subject. As the change in miR gene expression level caused by a cancer-associated chromosomal feature may also contribute to cancerigenesis, a given cancer can be treated by restoring the level of miR gene expression to normal. MicroRNA expression profiling can be used to diagnose cancer and predict whether a particular cancer is associated with an adverse prognosis. The identification of specific mutations associated with genomic regions that harbor miR genes in CLL patients provides a means for diagnosing CLL and possibly other cancers.

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 11/194,055, filed Jul. 29, 2005, which is a continuation-in-part of International Application No. PCT/US2005/004865, filed Feb. 9, 2005, which claims the benefit of U.S. Provisional Application No. 60/543,119, filed Feb. 9, 2004, U.S. Provisional Application No. 60/542,929, filed Feb. 9, 2004, U.S. Provisional Application No. 60/542,963, filed Feb. 9, 2004, U.S. Provisional Application No. 60/542,940, filed Feb. 9, 2004, U.S. Provisional Application No. 60/580,959, filed Jun. 18, 2004, and U.S. Provisional Application No. 60/580,797, filed Jun. 18, 2004. The entire teachings of the above applications are incorporated herein by reference.

GOVERNMENT SUPPORT

The invention described herein was supported in part by grant nos. P01CA76259, P01CA81534, and P30CA56036 from the National Cancer Institute. The U.S. government has certain rights in this invention.

FIELD OF THE INVENTION

The invention relates to the diagnosis of cancers, or the screening of individuals for the predisposition to cancer, by evaluating the status of at least one miR gene located in close proximity to chromosomal features, such as cancer-associated genomic regions, fragile sites, human papillomavirus integration sites, and homeobox genes and gene clusters. The invention also relates to the treatment of cancers by altering the amount of gene product produced from miR genes located in close proximity to these chromosomal features. The invention further provides methods of diagnosing CLL and other cancers by screening for mutations in miR genes.

BACKGROUND OF THE INVENTION

Taken as a whole, cancers are a significant source of mortality and morbidity in the U.S. and throughout the world. However, cancers are a large and varied class of diseases with diverse etiologies. Researchers therefore have been unable to develop treatments or diagnostic tests which cover more than a few types of cancer.

For example, cancers are associated with many different classes of chromosomal features. One such class of chromosomal features comprises perturbations in the genomic structure of certain genes, such as the deletion or mutation of tumor suppressor genes. The activation of proto-oncogenes by gene amplification or promoter activation (e.g., by viral integration), epigenetic modifications (e.g., a change in DNA methylation) and chromosomal translocations can also cause cancerigenesis. Such perturbations in the genomic structure which are involved in the etiology of cancers are called “cancer-associated genomic regions” or “CAGRs.”

Chromosomal fragile sites are another class of chromosomal feature implicated in the etiology of cancers. Chromosomal fragile sites are regions of genomic DNA which show an abnormally high occurrence of gaps or breaks when DNA synthesis is perturbed during metaphase. These fragile sites are categorized as “rare” or “common.” As their name suggests, rare fragile sites are uncommon. Such sites are associated with di- or tri-nucleotide repeats, can be induced in metaphase chromosomes by folic acid deficiency, and segregate in a Mendelian manner. An exemplary rare fragile site is the Fragile X site.

Common fragile sites are revealed when cells are grown in the presence of aphidicolin or 5-azacytidine, which inhibit DNA polymerase. At least eighty-nine common fragile sites have been identified, and at least one such site is found on every human chromosome. Thus, while their function is poorly understood, common fragile sites represent a basic component of the human chromosome structure.

Induction of fragile sites in vitro leads to increased sister-chromatid exchange and a high rate of chromosomal deletions, amplifications and translocations, while fragile sites have been colocalized with chromosome breakpoints in vivo. Also, most common fragile sites studied in tumor cells contain large, intra-locus deletions or translocations, and a number of tumors have been identified with deletions in multiple fragile sites. Chromosomal fragile sites are therefore mechanistically involved in producing many of the chromosomal lesions commonly seen in cancer cells.

Cervical cancer, which is the second leading cause of female cancer mortality worldwide, is highly associated with human papillomavirus (HPV) infection. Indeed, sequences from the HPV 16 or HPV 18 viruses are found in cells from nearly every cervical tumor examined. In malignant forms of cervical cancer, the HPV genome is found integrated into the genome of the cancer cells. HPV preferentially integrates in or near common chromosomal fragile sites. HPV integration into a host cell genome can cause large amplifications, deletions or rearrangements near the integration site. Expression of cellular genes near the HPV integration site can therefore be affected, which may contribute to the oncogenesis of the infected cell. These sites of HPV integration into a host cell genome are therefore considered another class of chromosomal feature that is associated with a cancer.

Homeobox genes are a conserved family of regulatory genes that contain the same 183-nucleotide sequence, called the “homeobox.” The homeobox genes encode nuclear transcription factors called “homeoproteins,” which regulate the expression of numerous downstream genes important in development. The homeobox sequence itself encodes a 61 amino acid “homeodomain” that recognizes and binds to a specific DNA binding motif in the target developmental genes. Homeobox genes are categorized as “class I” or “clustered” homeobox genes, which regulate antero-posterior patterning during embryogenesis, or “class II” homeobox genes, which are dispersed throughout the genome. Altogether, the homeobox genes account for more than 0.1% of the vertebrate genome.

The homeobox genes are believed to “decode” external inductive stimuli that signal a given cell to proceed down a particular developmental lineage. For example, specific homeobox genes might be activated in response to various growth factors or other external stimuli that activate signal transduction pathways in a cell. The homeobox genes then activate and/or repress specific programs of effector or developmental genes (e.g., morphogenetic molecules, cell-cycle regulators, pro- or anti-apoptotic proteins, etc.) to induce the phenotype “ordered” by the external stimuli. The homeobox system is clearly highly coordinated during embryogenesis and morphogenesis, but appears to be dysregulated during oncogenesis. Such dysregulation likely occurs because of disruptions in the genomic structure or chromosomal architecture surrounding the homeobox genes or gene clusters. The homeobox genes or gene clusters are therefore considered yet another chromosomal feature that is associated with cancers.

MicroRNAs (miRs) are naturally-occurring 19 to 25 nucleotide transcripts found in over one hundred distinct organisms, including fruit flies, nematodes and humans. The miRs are typically processed from 60- to 70-nucleotide foldback RNA precursor structures, which are transcribed from the miR gene. The miR precursor processing reaction requires Dicer RNase III and Argonaute family members (Sasaki et al. (2003), Genomics 82, 323-330). The miR precursor or processed miR products are easily detected, and an alteration in the levels of these molecules within a cell can indicate a perturbation in the chromosomal region containing the miR gene.

To date, at least 222 separate miR genes have been identified in the human genome. Two miR genes (miR-15a and miR-16a) have been localized to a homozygously deleted region on chromosome 13 that is correlated with chronic lymphocytic leukemia (Calin et al. (2002), Proc. Natl. Acad. Sci. USA 99:15524-29), and the miR-143/miR-145 gene cluster is downregulated in colon cancer (Michael et al. (2003), Mol. Cancer Res. 1:882-91). However, the distribution of miR genes throughout the genome, and the relationship of the miR genes to the diverse chromosomal features discussed herein, have not been systematically studied.

A method for reliably and accurately diagnosing, or for screening individuals for a predisposition to, cancers associated with such diverse chromosomal features as CAGRs, fragile sites, HPV integration sites and homeobox genes is needed. A method of treating cancers associated with these diverse chromosomal features is also highly desired.

SUMMARY OF THE INVENTION

It has now been discovered that miR genes are commonly associated with chromosomal features involved in the etiology of different cancers. The perturbations in the genomic structure or chromosomal architecture of a cell caused by a cancer-associated chromosomal feature can affect the expression of the miR gene(s) located in close proximity to that chromosomal feature. Evaluation of miR gene expression can therefore be used to indicate the presence of a cancer-causing chromosomal lesion in a subject. As the change in miR gene expression level caused by a cancer-associated chromosomal feature may also contribute to cancerigenesis, a given cancer can be treated by restoring the level of miR gene expression to normal.

The invention therefore provides a method of diagnosing cancer in a subject. The cancer can be any cancer associated with a cancer-associated chromosomal feature. As used herein, a cancer-associated chromosomal feature includes, but is not limited to, a cancer-associated genomic region, a chromosomal fragile site, a human papillomavirus integration site on a chromosome of the subject, and a homeobox gene or gene cluster on a chromosome of the subject. The cancer can also be any cancer associated with one or more adverse prognostic markers, including cancers associated with positive ZAP-70 expression, an unmutated IgV_(H) gene, positive CD38 expression, deletion at chromosome 11q23, and loss or mutation of TP53. In one embodiment, the diagnostic method comprises the following steps. In a sample obtained from a subject suspected of having a cancer associated with a cancer-associated chromosomal feature, the status of at least one miR gene located in close proximity to the cancer-associated chromosomal feature is evaluated by measuring the level of at least one miR gene product from the miR gene in the sample, provided the miR genes are not miR-15, miR-16, miR-143 or miR-145. An alteration in the level of miR gene product in the sample relative to the level of miR gene product in a control sample is indicative of the presence of the cancer in the subject. In a related embodiment, the diagnostic method comprises evaluating in a sample obtained from a subject suspected of having a cancer associated with a cancer-associated chromosomal feature, the status of at least one miR gene located in close proximity to the cancer-associated chromosomal feature, provided the miR gene is not miR-15 or miR-16, by measuring the level of at least one miR gene product from the miR gene in the sample. An alteration in the level of miR gene product in the sample relative to the level of miR gene product in a control sample is indicative of the presence of the cancer in the subject.

The status of the at least one miR gene in the subject's sample can also be evaluated by analyzing the at least one miR gene for a deletion, mutation and/or amplification. The detection of a deletion, mutation and/or amplification in the miR gene relative to the miR gene in a control sample is indicative of the presence of the cancer in the subject. The status of the at least one miR gene in the subject's sample can also be evaluated by measuring the copy number of the at least one miR gene in the sample, wherein a copy number other than two for miR genes located on any chromosome other than a Y chromosome, and other than one for miR genes located on a Y chromosome, is indicative of the subject either having or being at risk for having a cancer. In one embodiment, the diagnostic method comprises analyzing at least one miR gene in the sample for a deletion, mutation and/or amplification, wherein detection of a deletion, mutation and/or amplification in the miR gene relative to the miR gene in a control sample is indicative of the presence of the cancer in the subject. In a related embodiment, the diagnostic method comprises analyzing at least one miR gene in the sample for a deletion, mutation or amplification, provided the miR gene is not miR-15 or miR-16, wherein detection of a deletion, mutation and/or amplification in the miR gene relative to the miR gene in a control sample is indicative of the presence of the cancer in the subject. In a further embodiment, the diagnostic method comprises analyzing the miR-16 gene in the sample for a specific mutation, depicted in SEQ ID NO. 642, wherein detection of the mutation in the miR-16 gene relative to a miR-16 gene in a control sample is indicative of the presence of the cancer in the subject.
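The copy-number criterion stated above can be illustrated with a minimal sketch. The function names are hypothetical and are not part of the disclosure; the sketch simply applies the rule as worded (an expected copy number of one for a miR gene on a Y chromosome and two for a miR gene on any other chromosome) and does not address how the copy number is measured.

```python
def expected_copy_number(chromosome: str) -> int:
    """Expected copy number under the stated rule: one on Y, two elsewhere."""
    return 1 if chromosome.strip().upper() == "Y" else 2


def copy_number_is_indicative(chromosome: str, observed_copies: int) -> bool:
    """True when the observed copy number differs from the expected value."""
    return observed_copies != expected_copy_number(chromosome)


# Hypothetical observations: (chromosome carrying the miR gene, measured copies)
observations = [("13", 1), ("13", 2), ("Y", 1), ("17", 4)]
flags = [copy_number_is_indicative(c, n) for c, n in observations]
print(flags)
```

A deletion of one allele on chromosome 13 and an amplification on chromosome 17 are both flagged in this example, while the normal autosomal and Y-chromosome copy numbers are not.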

The invention also provides a method of screening subjects for a predisposition to develop a cancer associated with a cancer-associated chromosomal feature, by evaluating the status of at least one miR gene located in close proximity to the cancer-associated chromosomal feature in the same manner described herein for the diagnostic method. The cancer can be any cancer associated with a cancer-associated chromosomal feature.

In one embodiment, the level of the at least one miR gene product from the sample is measured by quantitatively reverse transcribing the miR gene product to form a complementary target oligodeoxynucleotide, and hybridizing the target oligodeoxynucleotide to a microarray comprising a probe oligonucleotide specific for the miR gene product. In another embodiment, the levels of multiple miR gene products in a sample are measured in this fashion, by quantitatively reverse transcribing the miR gene products to form complementary target oligodeoxynucleotides, and hybridizing the target oligodeoxynucleotides to a microarray comprising probe oligonucleotides specific for the miR gene products. In another embodiment, the multiple miR gene products are simultaneously reverse transcribed, and the resulting set of target oligodeoxynucleotides is simultaneously exposed to the microarray.

In a related embodiment, the invention provides a method of diagnosing cancer in a subject, comprising reverse transcribing total RNA from a sample from the subject to provide a set of labeled target oligodeoxynucleotides; hybridizing the target oligodeoxynucleotides to a microarray comprising miRNA-specific probe oligonucleotides to provide a hybridization profile for the sample; and comparing the sample hybridization profile to the hybridization profile generated from a control sample, an alteration in the profile being indicative of the subject either having, or being at risk for developing, a cancer. The microarray of miRNA-specific probe oligonucleotides preferably comprises miRNA-specific probe oligonucleotides for a substantial portion of the human miRNome, the full complement of microRNA genes in a cell. The microarray more preferably comprises at least about 60%, 70%, 80%, 90%, or 95% of the human miRNome. In one embodiment, the cancer is associated with a cancer-associated chromosomal feature, such as a cancer-associated genomic region or a chromosomal fragile site. In another embodiment, the cancer is associated with one or more adverse prognostic markers. In a particular embodiment, the cancer is B-cell chronic lymphocytic leukemia. In a further embodiment, the cancer is a subset of B-cell chronic lymphocytic leukemia that is associated with one or more adverse prognostic markers. As used herein, an adverse prognostic marker is any indicator, such as a specific genetic alteration or a level of expression of a gene, whose presence suggests an unfavorable prognosis concerning disease progression, the severity of the cancer, and/or the likelihood of developing the cancer.
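The comparison of a sample hybridization profile with a control profile can be sketched as follows. This is only an illustration under stated assumptions: the probe names and signal values are invented, the function name is hypothetical, and the two-fold cutoff (an absolute log2 ratio of 1.0) is an assumed threshold that is not specified in this disclosure.

```python
import math


def altered_probes(sample, control, min_abs_log2_ratio=1.0):
    """Return {probe: log2(sample/control)} for probes whose signal differs
    from the control by at least the (assumed) threshold."""
    altered = {}
    for probe, control_signal in control.items():
        sample_signal = sample.get(probe)
        # Skip probes that are missing or have non-positive signal.
        if sample_signal is None or sample_signal <= 0 or control_signal <= 0:
            continue
        log2_ratio = math.log2(sample_signal / control_signal)
        if abs(log2_ratio) >= min_abs_log2_ratio:
            altered[probe] = log2_ratio
    return altered


# Invented signal intensities for three hypothetical probes.
sample_profile = {"probe-A": 50.0, "probe-B": 400.0, "probe-C": 100.0}
control_profile = {"probe-A": 200.0, "probe-B": 100.0, "probe-C": 110.0}

alterations = altered_probes(sample_profile, control_profile)
print(alterations)
```

In this sketch a negative value marks a probe whose signal is lower in the sample than in the control, and a positive value marks one whose signal is higher; probes within the threshold are treated as unaltered.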

The invention further provides a method of treating a cancer associated with a cancer-associated chromosomal feature in a subject. The cancer can be any cancer associated with a cancer-associated chromosomal feature, for example, cancers associated with a cancer-associated genomic region, a chromosomal fragile site, a human papillomavirus integration site on a chromosome of the subject, or a homeobox gene or gene cluster on a chromosome of the subject. Furthermore, the cancer is a cancer associated with a cancer-associated chromosomal feature in which at least one isolated miR gene product from a miR gene located in close proximity to the cancer-associated chromosomal feature is down-regulated or up-regulated in cancer cells of the subject, as compared to control cells. When the at least one isolated miR gene product is down-regulated in the subject's cancer cells, the method comprises administering to the subject an effective amount of at least one isolated miR gene product from the at least one miR gene, such that proliferation of cancer cells in the subject is inhibited. When the at least one isolated miR gene product is up-regulated in the cancer cells, an effective amount of at least one compound for inhibiting expression of the at least one miR gene is administered to the subject, such that proliferation of cancer cells in the subject is inhibited.

The invention further provides a method of treating cancer associated with a cancer-associated chromosomal feature in a subject, comprising the following steps. The amount of miR gene product expressed from at least one miR gene located in close proximity to the cancer-associated chromosomal region in cancer cells from the subject is determined relative to control cells. If the amount of the miR gene product expressed in the cancer cells is less than the amount of the miR gene product expressed in control cells, the amount of miR gene product expressed in the cancer cells is altered by administering to the subject an effective amount of at least one isolated miR gene product from the miR gene, such that proliferation of cancer cells in the subject is inhibited. If the amount of the miR gene product expressed in the cancer cells is greater than the amount of the miR gene product expressed in control cells, the amount of miR gene product expressed in the cancer cells is altered by administering to the subject an effective amount of at least one compound for inhibiting expression of the at least one miR gene, such that proliferation of cancer cells in the subject is inhibited.
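The two branches of the treatment method just described can be summarized in a short sketch. The enum and function names are hypothetical labels introduced for illustration, the expression levels are assumed to be on a common scale, and the sketch ignores measurement variability; it only mirrors the less-than and greater-than comparison stated in the text.

```python
from enum import Enum


class Intervention(Enum):
    ADMINISTER_MIR_GENE_PRODUCT = "administer isolated miR gene product"
    ADMINISTER_EXPRESSION_INHIBITOR = "administer compound inhibiting miR gene expression"
    NONE_INDICATED = "no alteration relative to control"


def select_intervention(cancer_level: float, control_level: float) -> Intervention:
    """Choose the branch of the method from the measured expression levels."""
    if cancer_level < control_level:
        # miR gene product is down-regulated in the cancer cells.
        return Intervention.ADMINISTER_MIR_GENE_PRODUCT
    if cancer_level > control_level:
        # miR gene product is up-regulated in the cancer cells.
        return Intervention.ADMINISTER_EXPRESSION_INHIBITOR
    return Intervention.NONE_INDICATED


print(select_intervention(0.3, 1.0).value)
print(select_intervention(2.5, 1.0).value)
```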

The invention further provides pharmaceutical compositions comprising a pharmaceutically acceptable carrier and at least one miR gene product, or a nucleic acid expressing at least one miR gene product, from an miR gene located in close proximity to a cancer-associated chromosomal feature, provided the miR gene product is not miR-15 or miR-16.

The invention still further provides for the use of at least one miR gene product, or a nucleic acid expressing at least one miR gene product, from an miR gene located in close proximity to a cancer-associated chromosomal feature for the manufacture of a medicament for the treatment of a cancer associated with a cancer-associated chromosomal region.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an image of a Northern blot analysis of the expression of miR-16a (upper panel), miR-26a (middle panel), and miR-99a (lower panel) in normal human lung (lane 1) and human lung cancer cells (lanes 2-8). Below the three blots is an image of an ethidium bromide-stained gel indicating the 5S RNA lane loading control. The genomic location and the type of alteration are indicated.

FIG. 2 is a schematic representation demonstrating the position of various miR genes on human chromosomes in relation to HOX gene clusters.

FIG. 3 shows an miRNome expression analysis of 38 individual CLL samples. The main miR-associated CLL clusters are presented. The control samples are: MNC, mononuclear cells; Ly, diffuse large B cell lymphoma; CD5, selected CD5+ B lymphocytes.

FIG. 4 is an image of a Northern blot analysis of the expression of miR-16a (upper panel), miR-26a (middle panel), and miR-99a (lower panel) in 12 B-CLL samples. Below the three blots is an image of an ethidium bromide-stained gel indicating the 5S RNA lane loading control. miR-16a expression levels varied in these B-CLL cases, and were either low or absent in several of the samples tested. However, the expression levels of miR-26a and miR-99a, both located in regions not involved in B-CLL, were relatively constant in the tested samples.

FIG. 5 shows Kaplan-Meier curves depicting the relationship between miRNA expression levels and the time from diagnosis to either the time of initial therapy or the present, if therapy had not commenced. The proportion of untreated patients with CLL is plotted against time since diagnosis. The patients are grouped according to the expression profile generated by 11 microRNA genes.

FIG. 6 shows the expression levels of miR-16-1 and miR-15a miRNAs in samples from two patients with a miR-16-1 mutation (see SEQ ID NO. 642) and in CD5+ cell samples from normal patients, both by Northern blot analysis (upper panels) and by miRNACHIP (expression level indicated by numbers below panels).

FIG. 7A is a schematic depicting the locations of mutations affecting various miRNAs. The mutated (below chromosome) and normal (above chromosome) nucleotide base is presented for each mutation/polymorphism. The figure is not drawn to scale.

FIG. 7B depicts the RT-PCR amplification products of primary transcripts corresponding to various mutant miR gene products for which mutations have been identified in B-CLL cells, as well as the length of the amplified genomic DNA (G). GAPDH levels were used for normalization; RT+=reverse transcription, RT−=control without reverse transcription, G=genomic control.

FIG. 7C presents the chromatograms for the genomic regions of samples having either normal miR-16-1/15a (top) or mutated miR-16-1/15a (CtoT)+7 (bottom). The precise position of the precursor (line with period at end) and the location of the mutation (arrowheads) are indicated.

FIG. 7D shows the expression levels by miRNACHIP (MAr) and Northern blot (NB) analysis for miR-16-1 and miR-15a in samples from two normal CD5 pools (CD5+) and from both of the patients carrying the germline (CtoT)+7 mutation (CLL). The Northern blot band intensities were quantified using ImageQuantTL (Nonlinear Dynamics Ltd.). Data are presented as arbitrary units.

FIG. 7E is a Northern blot showing that the germline mutation in the pri-miR-16-1 is associated with abnormal expression of the active, mature miR-16-1 molecule. Levels of expression were assessed in 293 cells transfected with miR-16-1-WT, miR-16-1-MUT or Empty vector (Empty V), as indicated. Untransfected 293 cells were tested as a control. Normalization for loading was performed with a U6 probe (miR-15a; left panel) and the transfection levels were normalized with anti-GFP signal on cell lysates from the same pellet as that used for Northern blotting (miR-16-1; right panel).

DETAILED DESCRIPTION OF THE INVENTION

All nucleic acid sequences herein are given in the 5′ to 3′ direction. In addition, genes are represented by italics, and gene products are represented by normal type; e.g., mir-17 is the gene and miR-17 is the gene product.

It has now been discovered that the genes that comprise the miR gene complement of the human genome (or “miRNome”) are non-randomly distributed throughout the genome in relation to each other. For example, of 222 human miR genes, at least ninety are located in thirty-six gene clusters, typically with two or three miR genes per cluster (median=2.5). The largest cluster is composed of six genes located on chromosome 13 at 13q31; the miR genes in this cluster are miR-17/miR-18/miR-19a/miR-20/miR-19b1/miR-92-1.

The human miR genes are also non-randomly distributed across the human chromosomal complement. For example, chromosome 4 has a less-than-expected rate of miRs, and chromosomes 17 and 19 contain significantly more miR genes than expected based on chromosome size. Indeed, six of the thirty-six miR gene clusters (17%), containing 16 of 90 clustered genes (18%), are located on chromosomes 17 and 19, which account for only 5% of the entire human genome.
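The percentages quoted above follow directly from the stated counts, as the short calculation below shows. The fold-enrichment figure at the end is an illustrative derived quantity, computed here from the stated 5% genome share; it is not a value reported in the text.

```python
# Counts stated in the text for chromosomes 17 and 19.
clusters_on_17_and_19 = 6
total_clusters = 36
clustered_genes_on_17_and_19 = 16
total_clustered_genes = 90
genome_share_of_17_and_19 = 0.05  # "only 5% of the entire human genome"

cluster_share = clusters_on_17_and_19 / total_clusters
gene_share = clustered_genes_on_17_and_19 / total_clustered_genes

# Derived for illustration only: share of clustered genes relative to genome share.
fold_enrichment = gene_share / genome_share_of_17_and_19

print(f"clusters: {cluster_share:.1%}, clustered genes: {gene_share:.1%}")
print(f"fold enrichment over genome share: {fold_enrichment:.1f}")
```

Both shares round to the percentages given in the text (17% and 18%), and each is more than three times the 5% of the genome that these two chromosomes represent.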

The sequences of the gene products of 187 miR genes are provided in Table 1. The location and distribution of these 187 miR genes in the human genome is given in Tables 2 and 3; see also Example 1. All Tables are located in the Examples section below. As used herein, an “miR gene product” or “miRNA” means the unprocessed or processed RNA transcript from an miR gene. As the miR gene products are not translated into a protein, the term “miR gene products” does not include proteins.

As used herein, “probe oligonucleotide” refers to an oligonucleotide that is capable of hybridizing to a target oligonucleotide. “Target oligonucleotide” or “target oligodeoxynucleotide” refers to a molecule to be detected (e.g., in a hybridization). By “miR-specific probe oligonucleotide” or “probe oligonucleotide specific for an miR” is meant a probe oligonucleotide that has a sequence selected to hybridize to a specific miR gene product, or to a reverse transcript of the specific miR gene product.

The unprocessed miR gene transcript is also called an “miR precursor,” and typically comprises an RNA transcript of about 70 nucleotides in length. The miR precursor can be processed by digestion with an RNAse (such as Dicer, Argonaut, or RNAse III, e.g., E. coli RNAse III) into an active 19-25 nucleotide RNA molecule. This active 19-25 nucleotide RNA molecule is also called the “processed miR gene transcript.”

The active 19-25 nucleotide RNA molecule can be obtained from the miR precursor through natural processing routes (e.g., using intact cells or cell lysates) or by synthetic processing routes (e.g., using isolated processing enzymes, such as isolated Dicer, Argonaut, or RNAse III). It is understood that the active 19-25 nucleotide RNA molecule can also be produced directly by biological or chemical syntheses, without having been processed from the miR precursor. For ease of discussion, such a directly produced active 19-25 nucleotide RNA molecule is also referred to as a “processed miR gene product.”

As used herein, “miR gene expression” refers to the production of miR gene products from an miR gene, including processing of the miR precursor into a processed miR gene product.

The human miR genes are closely associated with different classes of chromosomal features that are themselves associated with cancer. As used herein, a “cancer-associated chromosomal feature” refers to a region of a given chromosome, which, when perturbed, is correlated with the occurrence of at least one human cancer. As used herein, a chromosomal feature is “correlated” with a cancer when the feature and the cancer occur together in individuals of a study population in a manner not expected on the basis of chance alone.

A region of a chromosome is “perturbed” when the chromosomal architecture or genomic DNA sequence in that region is disturbed or differs from the normal architecture or sequence in that region. Exemplary perturbations of chromosomal regions include, e.g., chromosomal breakage and translocation, mutations, deletions or amplifications of genomic DNA, a change in the methylation pattern of genomic DNA, the presence of fragile sites, and the presence of viral integration sites. One skilled in the art would recognize that other chromosomal perturbations associated with a cancer are possible.

It is understood that a cancer-associated chromosomal feature can be a chromosomal region where perturbations are known to occur at a higher rate than at other regions in the genome, but where the perturbation has not yet occurred. For example, a common chromosomal breakpoint or fragile site is considered a cancer-associated chromosomal feature, even if a break has not yet occurred. Likewise, a region in the genomic DNA known as a mutational “hotspot” can be a cancer-associated chromosomal feature, even if no mutations have yet occurred in the region.

One class of cancer-associated chromosomal feature which is closely associated with miR genes in the human genome is a “cancer-associated genomic region” or “CAGR” (see Table 4). As used herein, a “CAGR” includes any region of the genomic DNA that comprises a genetic or epigenetic change (or the potential for a genetic or epigenetic change) that differs from normal DNA, and which is correlated with a cancer. Exemplary genetic changes include single- and double-stranded breaks (including common breakpoint regions in or near possible oncogenes or tumor-suppressor genes); chromosomal translocations; mutations, deletions, insertions (including viral, plasmid or transposon integrations) and amplifications (including gene duplications) in the DNA; minimal regions of loss-of-heterozygosity (LOH) suggestive of the presence of tumor-suppressor genes; and minimal regions of amplification suggestive of the presence of oncogenes. Exemplary epigenetic changes include any changes in DNA methylation patterns (e.g., DNA hyper- or hypo-methylation, especially in promoter regions). As used herein, “cancer-associated genomic region” or “CAGR” specifically excludes chromosomal fragile sites or human papillomavirus insertion sites.

Many of the known miR genes in the human genome are in or near CAGRs, including 80 miR genes that are located exactly in minimal regions of LOH or minimal regions of amplification correlated to a variety of cancers. Other miR genes are located in or near breakpoint regions, deleted areas, or regions of amplification. The distribution of miR genes in the human genome relative to CAGRs is given in Tables 6 and 7 and in Example 4A below.

As used herein, an miR gene is “associated” with a given CAGR when the miR gene is located in close proximity to the CAGR; i.e., when the miR is located within the same chromosomal band or within 3 megabases (3 Mb) of the CAGR. See Tables 6 and 7 and Example 4A below for a description of cancers which are correlated with CAGRs, and a description of miRs associated with those CAGRs.
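The definition of “associated” given above (same chromosomal band, or within 3 Mb) can be expressed as a simple test. The following is a minimal sketch only: the `Locus` type, the function names and all coordinates are hypothetical, and the text does not specify how distance to an extended region is measured, so the sketch assumes the gap between the nearest edges of the two intervals.

```python
from dataclasses import dataclass

THREE_MB = 3_000_000  # 3 megabases, as stated in the definition


@dataclass(frozen=True)
class Locus:
    chromosome: str  # e.g. "13"
    band: str        # cytogenetic band on that chromosome, e.g. "q14.3"
    start: int       # start coordinate in base pairs
    end: int         # end coordinate in base pairs


def gap_bp(a: Locus, b: Locus) -> int:
    """Gap between two intervals on the same chromosome (0 if they overlap)."""
    if a.end < b.start:
        return b.start - a.end
    if b.end < a.start:
        return a.start - b.end
    return 0


def is_associated(mir: Locus, feature: Locus, max_distance: int = THREE_MB) -> bool:
    """True if the miR gene lies in the same band as, or within 3 Mb of, the feature."""
    if mir.chromosome != feature.chromosome:
        return False
    if mir.band == feature.band:
        return True
    return gap_bp(mir, feature) <= max_distance


# Invented coordinates, for illustration only.
mir_gene = Locus("13", "q14.3", 50_000_000, 50_000_080)
nearby_region = Locus("13", "q14.2", 47_500_000, 48_000_000)
distant_region = Locus("13", "q31.3", 91_000_000, 91_500_000)

print(is_associated(mir_gene, nearby_region), is_associated(mir_gene, distant_region))
```

The same test would apply unchanged to any other chromosomal feature for which proximity is defined by band or by a 3 Mb window.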

For example, cancers associated with CAGRs include leukemia (e.g., AML, CLL, pro-lymphocytic leukemia), lung cancer (e.g., small cell and non-small cell lung carcinoma), esophageal cancer, gastric cancer, colorectal cancer, brain cancer (e.g., astrocytoma, glioma, glioblastoma, medulloblastoma, meningioma, neuroblastoma), bladder cancer, breast cancer, cervical cancer, epithelial cancer, nasopharyngeal cancer (e.g., oral or laryngeal squamous cell carcinoma), lymphoma (e.g., follicular lymphoma), uterine cancer (e.g., malignant fibrous histiocytoma), hepatic cancer (e.g., hepatocellular carcinoma), head-and-neck cancer (e.g., head-and-neck squamous cell carcinoma), renal cancer, male germ cell tumors, malignant mesothelioma, myelodysplastic syndrome, ovarian cancer, pancreatic or biliary cancer, prostate cancer, thyroid cancer (e.g., sporadic follicular thyroid tumors), and urothelial cancer.

Examples of miR genes associated with CAGRs include miR-153-2, let-7i, miR-33a, miR-34a-2, miR-34a-1, let-7a-1, let-7d, let-7f-1, miR-24-1, miR-27b, miR-23b, miR-181a, miR-199b, miR-218-1, miR-31, let-7a-2, let-7g, miR-21, miR-32a-1, miR-33b, miR-100, miR-101-1, miR-125b-1, miR-135-1, miR-142as, miR-142s, miR-144, miR-301, miR-297-3, miR-155(BIC), miR-26a, miR-17, miR-18, miR-19a, miR-19b1, miR-20, miR-92-1, miR-128a, miR-7-3, miR-22, miR-123, miR-132, miR-149, miR-161, miR-177, miR-195, miR-212, let-7c, miR-99a, miR-125b-2, miR-210, miR-135-2, miR-124a-1, miR-208, miR-211, miR-180, miR-145, miR-143, miR-127, miR-136, miR-138-1, miR-154, miR-134, miR-299, miR-203, miR-34, miR-92-2, miR-19b-2, miR-108-1, miR-193, miR-106a, miR-29a, miR-29b, miR-129-1, miR-182s, miR-182as, miR-96, miR-183, miR-32, miR-159-1, miR-192 and combinations thereof.

Specific groupings of miR gene(s) that are associated with a particular cancer are evident from Tables 6 and 7, and are preferred. For example, acute myeloid leukemia (AML) is associated with miR-153-2, and adenocarcinoma of the lung or esophagus is associated with let-7i. Where more than one miR gene is listed in Tables 6 and 7, it is understood that the cancer associated with those genes can be diagnosed by evaluating any one of the listed miR genes, or by evaluating any combination of the listed miR genes. Subgenera of CAGRs and/or associated miR gene(s) would also be evident to one of ordinary skill in the art from Tables 6 and 7.

Another class of cancer-associated chromosomal feature which is closely associated with miR genes in the human genome is a “chromosomal fragile site” or “FRA” (see Table 4 and Example 2). As used herein, a “FRA” includes any rare or common fragile site in a chromosome; e.g., one that can be induced by subjecting a cell to stress during DNA replication. For example, a rare FRA can be induced by subjecting the cell to folic acid deficiency during DNA replication. A common FRA can be induced by treating the cell with aphidicolin or 5-azacytidine during DNA replication. The identification or induction of chromosomal fragile sites is within the skill in the art; see, e.g., Arlt et al. (2003), Cytogenet. Genome Res. 100:92-100 and Arlt et al. (2002), Genes, Chromosomes and Cancer 33:82-92, the entire disclosures of which are herein incorporated by reference.

Approximately 20% of the known human miR genes are located in (13 miRs) or within 3 Mb (22 miRs) of cloned FRAs. Indeed, miR genes occur inside fragile sites at a rate 9.12 times higher than in non-fragile sites. Moreover, after studying 113 fragile sites in a human karyotype, it was found that 61 miR genes are located in the same chromosomal band as a FRA. The distribution of miR genes in the human genome relative to FRAs is given in Table 5 and in Example 2.

As used herein, an miR gene is “associated” with a given FRA when the miR gene is located in close proximity to the FRA; i.e., when the miR is located within the same chromosomal band or within 3 megabases (3 Mb) of the FRA. See Table 5 and Example 2 for a description of cancers which are correlated with FRAs, and a description of miRs associated with those FRAs.

For example, cancers associated with FRAs include bladder cancer, esophageal cancer, lung cancer, stomach cancer, kidney cancer, cervical cancer, ovarian cancer, breast cancer, lymphoma, Ewing sarcoma, hematopoietic tumors, solid tumors and leukemia.

Examples of miR genes associated with FRAs include miR-186, miR-101-1, miR-194, miR-215, miR-106b, miR-25, miR-93, miR-29b, miR-29a, miR-96, miR-182s, miR-182as, miR-183, miR-129-1, let-7a-1, let-7d, let-7f-1, miR-23b, miR-24-1, miR-27b, miR-32, miR-159-1, miR-192, miR-125b-1, let-7a-2, miR-100, miR-196-2, miR-148b, miR-190, miR-21, miR-301, miR-142s, miR-142as, miR-105-1, miR-175 and combinations thereof.

Specific groupings of miR gene(s) that are associated with a particular cancer and FRA are evident from Table 5, and are preferred. For example, FRA7H is correlated with esophageal cancer, and is associated with miR-29b, miR-29a, miR-96, miR-182s, miR-182as, miR-183, and miR-129-1. FRA9D is correlated with bladder cancer, and is associated with let-7a-1, let-7d, let-7f-1, miR-23b, miR-24-1, and miR-27b. Where more than one miR gene is listed in Table 5 in association with a FRA, it is understood that the cancer associated with those miR genes can be diagnosed by evaluating any one of the listed miR genes, or by evaluating any combination of the listed miR genes. Subgenera of CAGRs and/or associated miR gene(s) would also be evident to one of ordinary skill in the art from Table 5.

Another class of cancer-associated chromosomal feature which is closely associated with miR genes in the human genome is a “human papillomavirus (HPV) integration site” (see Table 4 and Example 3). As used herein, an “HPV integration site” includes any site in a chromosome of a subject where some or all of an HPV genome can insert into the genomic DNA, or any site where some or all of an HPV genome has inserted into the genomic DNA. HPV integration sites are often associated with common FRAs, but are distinct from FRAs for purposes of the present invention. Any species or strain of HPV can insert some or all of its genome into an HPV integration site. However, the most common strains of HPV which insert some or all of their genomes into an HPV integration site are HPV 16 and HPV 18. The identification of HPV integration sites in the human genome is within the skill in the art; see, e.g., Thorland et al. (2000), Cancer Res. 60:5916-21, the entire disclosure of which is herein incorporated by reference.

Thirteen miR genes (7%) are located within 2.5 Mb of seven of the seventeen (45%) cloned integration sites in the human genome. The relative incidence of miRs at HPV16 integration sites occurred at a rate 3.22 times higher than in the rest of the genome. Indeed, four miR genes (miR-21, miR-301, miR-142s and miR-142as) were located within one cluster of integration sites at chromosome 17q23, in which there are three HPV16 integration events spread over roughly 4 Mb of genomic sequence.

As used herein, an miR gene is “associated” with a given HPV integration site when the miR gene is located in close proximity to the HPV integration site; i.e., when the miR is located within the same chromosomal band or within 3 megabases (3 Mb), preferably within 2.5 Mb, of the HPV integration site. See Table 5 and Example 3 for a description of miRs associated with HPV integration sites.

Insertion of HPV sequences into the genome of a subject is correlated with the occurrence of cervical cancer. Examples of miR genes associated with HPV integration sites on human chromosomes include miR-21, miR-301, miR-142as, miR-142s, miR-194, miR-215, miR-32 and combinations thereof.

Specific groupings of miR gene(s) that are associated with a particular HPV integration site are evident from Table 5, and are preferred. For example, the HPV integration site located in or near FRA9E is associated with miR-32. The HPV integration site located in or near FRA1H is associated with miR-194 and miR-215. The HPV integration site located in or near FRA17B is associated with miR-21, miR-301, miR-142s, and miR-142as. Where more than one miR gene is listed in Table 5 in relation to an HPV integration site, it is understood that the cancer associated with those miR genes can be diagnosed by evaluating any one of the listed miR genes, or by evaluating any combination of the listed miR genes.

Another class of cancer-associated chromosomal feature which is closely associated with miR genes in the human genome is a “homeobox gene or gene cluster” (see Table 4 and Example 5). As used herein, a “homeobox gene or gene cluster” is a single gene or a grouping of genes, characterized in that the gene or genes have been classified by sequence or function as a class I or class II homeobox gene or contain the 183-nucleotide “homeobox” sequence. Identification and characterization of homeobox genes or gene clusters are within the skill in the art; see, e.g., Cillo et al. (1999), Exp. Cell Res. 248:1-9 and Pollard et al. (2000), Current Biology 10:1059-62, the entire disclosures of which are herein incorporated by reference.

Of the four known class I homeobox gene clusters in the human genome, three contain miR genes: miR-10a and miR-196-1 are in the HOX B cluster on 17q21; miR-196-2 is in the HOX C cluster at 12q13; and miR-10b is in the HOX D cluster at 2q31. Three other miRs (miR-148, miR-152 and miR-148b) are located within 1 Mb of a HOX gene cluster. miR genes are also found within class II homeobox gene clusters; for example, seven microRNAs (miR-129-1, miR-153-2, let-7a-1, let-7f-1, let-7d, miR-202 and miR-139) are located within 0.5 Mb of class II homeotic genes. See Example 5 and FIG. 2 for a description of miRs associated with homeobox genes or gene clusters in the human genome.

Examples of homeobox genes associated with miR genes in the human genome include genes in the HOXA cluster, genes in the HOXB cluster, genes in the HOXC cluster, genes in the HOXD cluster, NK1, NK3, NK4, Lbx, Tlx, Emx, Vax, Hmx, NK6, Msx, Cdx, Xlox, Gsx, En, HB9, Gbx, Msx-1, Msx-2, GBX2, HLX, HEX, PMX1, DLX, LHX2 and CDX2. Examples of homeobox gene clusters associated with miR genes in the human genome include HOXA, HOXB, HOXC, HOXD, extended Hox, NKL, ParaHox, EHGbox, PAX, PBX, MEIS, REIG and PREP/KNOX1.

Examples of cancers associated with homeobox genes or gene clusters include renal cancer, Wilms' tumor, colorectal cancer, small cell lung cancer, melanoma, breast cancer, prostate cancer, skin cancer, osteosarcoma, neuroblastoma, leukemia (acute lymphocytic leukemia, acute myeloid leukemia, chronic lymphocytic leukemia), glioblastoma multiforme, medulloblastoma, lymphoplasmacytoid lymphoma, thyroid cancer, rhabdomyosarcoma and solid tumors.

Examples of miR genes associated with homeobox genes or gene clusters include miR-148, miR-10a, miR-196-1, miR-152, miR-196-2, miR-148b, miR-10b, miR-129-1, miR-153-2, miR-202, miR-139, let-7a, let-7f, let-7d and combinations thereof.

Specific groupings of miR gene(s) that are associated with particular homeobox genes or gene clusters are evident from Example 5 and FIG. 2, and are preferred. For example, homeobox gene cluster HOXA is associated with miR-148. Homeobox gene cluster HOXB is associated with miR-148, miR-10a, miR-196-1, miR-152 and combinations thereof. Homeobox gene cluster HOXC is associated with miR-196-2, miR-148b or a combination thereof. Homeobox gene cluster HOXD is associated with miR-10b. Where more than one miR gene is associated with a homeobox gene or gene cluster, it is understood that the cancer associated with those genes can be diagnosed by evaluating any one of the miR genes, or by evaluating any combination of the miR genes. In one embodiment, the miR gene or gene product that is measured or analyzed is not miR-15, miR-16, miR-143 and/or miR-145.

Without wishing to be bound by any theory, it is believed that perturbations in the genomic structure or chromosomal architecture of a cell which comprise the cancer-associated chromosomal feature can affect the expression of the miR gene(s) associated with the feature in that cell. For example, a CAGR can comprise an amplification of the region containing an miR gene(s), causing an up-regulation of miR gene expression. Likewise, the CAGR can comprise a chromosomal breakpoint or a deletion that disrupts gene expression, and results in a down-regulation of miR gene expression. HPV integrations and FRAs can cause deletions, amplifications or rearrangement of the surrounding DNA, which can also affect the structure or expression of any associated miR genes. The factors which cause the collected dysregulation of homeobox genes or gene clusters would cause similar disruptions to any associated miR genes. A change in the status of at least one of the miR genes associated with a cancer-associated chromosomal feature in a tissue or cell sample from a subject, relative to the status of that miR gene in a control sample, therefore is indicative of the presence of a cancer, or a susceptibility to cancer, in a subject.

Without wishing to be bound by any theory, it is also believed that a change in status of miR genes associated with a cancer-associated chromosomal feature can be detected prior to, or in the early stages of, the development of transformed or neoplastic phenotypes in cells of a subject. The invention therefore also provides a method of screening subjects for a predisposition to developing a cancer associated with a cancer-associated chromosomal feature, by evaluating the status of at least one miR gene associated with a cancer-associated chromosomal feature in a tissue or cell sample from a subject, relative to the status of that miR gene in a control sample. Subjects with a change in the status of one or more miR genes associated with a cancer-associated chromosomal feature are candidates for further testing to determine or confirm that the subjects have cancer. Such further testing can comprise histological examination of blood or tissue samples, or other techniques within the skill in the art.

As used herein, the “status of an miR gene” refers to the condition of the miR gene in terms of its physical sequence or structure, or its ability to express a gene product. Thus, the status of an miR gene in cells of a subject can be evaluated by any technique suitable for detecting genetic or epigenetic changes in the miR gene, or by any technique suitable for detecting the level of miR gene product produced from the miR gene.

For example, the level of at least one miR gene product produced from an miR gene can be measured in cells of a biological sample obtained from the subject. An alteration in the level (i.e., an up- or down-regulation) of miR gene product in the sample obtained from the subject relative to the level of miR gene product in a control sample is indicative of the presence of the cancer in the subject. As used herein, a “subject” is any mammal suspected of having a cancer associated with a cancer-associated chromosomal feature. In one embodiment, the subject is a human suspected of having a cancer associated with a cancer-associated chromosomal feature. As used herein, expression of an miR gene is “up-regulated” when the amount of miR gene product produced from that gene in a cell or tissue sample from a subject is greater than the amount produced from the same gene in a control cell or tissue sample. Likewise, expression of an miR gene is “down-regulated” when the amount of miR gene product produced from that gene in a cell or tissue sample from a subject is less than the amount produced from the same gene in a control cell or tissue sample.
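
As a minimal sketch, the definitions of “up-regulated” and “down-regulated” given above amount to a direct comparison of two measured amounts. The function name, the “unchanged” label, and the numeric levels below are assumptions made for illustration; the sketch takes the levels as already-measured numbers in the same arbitrary units and does not account for measurement variability.

```python
def classify_regulation(subject_level: float, control_level: float) -> str:
    """Compare the amount of miR gene product in a subject sample with
    the amount in a control sample (same arbitrary units)."""
    if subject_level > control_level:
        return "up-regulated"
    if subject_level < control_level:
        return "down-regulated"
    return "unchanged"  # label chosen for this sketch only


# Hypothetical levels, for illustration only.
print(classify_regulation(subject_level=8.0, control_level=2.0))
print(classify_regulation(subject_level=0.5, control_level=2.0))
```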

Methods for determining RNA expression levels in cells from a biological sample are within the level of skill in the art. For example, a tissue sample can be removed from a subject suspected of having cancer associated with a cancer-associated chromosomal feature by conventional biopsy techniques. In another example, a blood sample can be removed from the subject, and white blood cells isolated for DNA extraction by standard techniques. The blood or tissue sample is preferably obtained from the subject prior to initiation of radiotherapy, chemotherapy or other therapeutic treatment. A corresponding control tissue or blood sample can be obtained from unaffected tissues of the subject, from a normal human individual or population of normal individuals, or from cultured cells corresponding to the majority of cells in the subject's sample. The control tissue or blood sample is then processed along with the sample from the subject, so that the levels of miR gene product produced from a given miR gene in cells from the subject's sample can be compared to the corresponding miR gene product levels from cells of the control sample.

For example, the relative miR gene expression in the control and normal samples can be conveniently determined with respect to one or more RNA expression standards. The standards can comprise, for example, a zero miR gene expression level, the miR gene expression level in a standard cell line, or the average level of miR gene expression previously obtained for a population of normal human controls.

Suitable techniques for determining the level of RNA transcripts of a particular gene in cells are within the skill in the art. According to one such method, total cellular RNA can be purified from cells by homogenization in the presence of nucleic acid extraction buffer, followed by centrifugation. Nucleic acids are precipitated, and DNA is removed by treatment with DNase and precipitation. The RNA molecules are then separated by gel electrophoresis on agarose gels according to standard techniques, and transferred to nitrocellulose filters by, e.g., the so-called “Northern” blotting technique. The RNA is then immobilized on the filters by heating. Detection and quantification of specific RNA is accomplished using appropriately labeled DNA or RNA probes complementary to the RNA in question. See, for example, Molecular Cloning: A Laboratory Manual, J. Sambrook et al., eds., 2nd edition, Cold Spring Harbor Laboratory Press, 1989, Chapter 7, the entire disclosure of which is incorporated by reference.

Suitable probes for Northern blot hybridization of a given miR gene product can be produced from the nucleic acid sequences provided in Table 1. Methods for preparation of labeled DNA and RNA probes, and the conditions for hybridization thereof to target nucleotide sequences, are described in Molecular Cloning: A Laboratory Manual, J. Sambrook et al., eds., 2nd edition, Cold Spring Harbor Laboratory Press, 1989, Chapters 10 and 11, the disclosures of which are herein incorporated by reference.

For example, the nucleic acid probe can be labeled with, e.g., a radionuclide such as ³H, ³²P, ³³P, ¹⁴C, or ³⁵S; a heavy metal; or a ligand capable of functioning as a specific binding pair member for a labeled ligand (e.g., biotin, avidin or an antibody), a fluorescent molecule, a chemiluminescent molecule, an enzyme or the like.

Probes can be labeled to high specific activity by either the nick translation method of Rigby et al. (1977), J. Mol. Biol. 113:237-251 or by the random priming method of Feinberg et al. (1983), Anal. Biochem. 132:6-13, the entire disclosures of which are herein incorporated by reference. The latter is the method of choice for synthesizing ³²P-labeled probes of high specific activity from single-stranded DNA or from RNA templates. For example, by replacing preexisting nucleotides with highly radioactive nucleotides according to the nick translation method, it is possible to prepare ³²P-labeled nucleic acid probes with a specific activity well in excess of 10⁸ cpm/microgram. Autoradiographic detection of hybridization can then be performed by exposing hybridized filters to photographic film. Densitometric scanning of the photographic films exposed by the hybridized filters provides an accurate measurement of miR gene transcript levels. Using another approach, miR gene transcript levels can be quantified by computerized imaging systems, such as the Molecular Dynamics 400-B 2D Phosphorimager available from Amersham Biosciences, Piscataway, N.J.

Where radionuclide labeling of DNA or RNA probes is not practical, the random-primer method can be used to incorporate an analogue, for example, the dTTP analogue 5-(N-(N-biotinyl-epsilon-aminocaproyl)-3-aminoallyl)deoxyuridine triphosphate, into the probe molecule. The biotinylated probe oligonucleotide can be detected by reaction with biotin-binding proteins, such as avidin, streptavidin, and antibodies (e.g., anti-biotin antibodies) coupled to fluorescent dyes or enzymes that produce color reactions.

In addition to Northern and other RNA blotting hybridization techniques, determining the levels of RNA transcripts can be accomplished using the technique of in situ hybridization. This technique requires fewer cells than the Northern blotting technique, and involves depositing whole cells onto a microscope cover slip and probing the nucleic acid content of the cell with a solution containing radioactive or otherwise labeled nucleic acid (e.g., cDNA or RNA) probes. This technique is particularly well-suited for analyzing tissue biopsy samples from subjects. The practice of the in situ hybridization technique is described in more detail in U.S. Pat. No. 5,427,916, the entire disclosure of which is incorporated herein by reference. Suitable probes for in situ hybridization of a given miR gene product can be produced from the nucleic acid sequences provided in Table 1, as described above.

The relative number of miR gene transcripts in cells can also be determined by reverse transcription of miR gene transcripts, followed by amplification of the reverse-transcribed transcripts by polymerase chain reaction (RT-PCR). The levels of miR gene transcripts can be quantified in comparison with an internal standard, for example, the level of mRNA from a “housekeeping” gene present in the same sample. A suitable “housekeeping” gene for use as an internal standard includes, e.g., myosin or glyceraldehyde-3-phosphate dehydrogenase (G3PDH). The methods for quantitative RT-PCR and variations thereof are within the skill in the art.
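
The use of an internal standard described above can be sketched as a simple ratio calculation. This is an illustrative assumption rather than a prescribed protocol: the signals are taken to be already-quantified amounts in arbitrary units, the function names are hypothetical, and established quantitative RT-PCR analysis methods may normalize differently.

```python
def normalized_level(mir_signal: float, housekeeping_signal: float) -> float:
    """Express the miR transcript signal relative to the housekeeping
    gene signal measured in the same sample."""
    if housekeeping_signal <= 0:
        raise ValueError("housekeeping signal must be positive")
    return mir_signal / housekeeping_signal


def relative_expression(subject_mir: float, subject_hk: float,
                        control_mir: float, control_hk: float) -> float:
    """Ratio of the normalized subject level to the normalized control
    level; values below 1 suggest lower expression in the subject."""
    control_norm = normalized_level(control_mir, control_hk)
    if control_norm == 0:
        raise ValueError("control miR signal must be non-zero")
    return normalized_level(subject_mir, subject_hk) / control_norm


# Hypothetical signals in arbitrary units, for illustration only.
ratio = relative_expression(subject_mir=50.0, subject_hk=100.0,
                            control_mir=200.0, control_hk=100.0)
print(ratio)
```

Normalizing each sample to its own housekeeping signal before comparing subject and control is intended to compensate for differences in the amount of input RNA between samples.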

In some instances, it may be desirable to simultaneously determine the expression level of a plurality of different miR genes in a sample. In certain instances, it may be desirable to determine the expression level of the transcripts of all known miR genes correlated with cancer. Assessing cancer-specific expression levels for hundreds of miR genes is time consuming and requires a large amount of total RNA (at least 20 μg for each Northern blot) and autoradiographic techniques that require radioactive isotopes. To overcome these limitations, an oligo library in microchip format may be constructed containing a set of probe oligonucleotides specific for a set of miR genes. In one embodiment, the oligo library contains probes corresponding to all known miRs from the human genome. The microchip oligo library may be expanded to include additional miRNAs as they are discovered.

The microchip is prepared from gene-specific oligonucleotide probes generated from known miRNAs. According to one embodiment, the array contains two different oligonucleotide probes for each miRNA, one containing the active sequence and the other being specific for the precursor of the miRNA. The array may also contain controls such as one or more mouse sequences differing from human orthologs by only a few bases, which can serve as controls for hybridization stringency conditions. tRNAs from both species may also be printed on the microchip, providing an internal, relatively stable positive control for specific hybridization. One or more appropriate controls for non-specific hybridization may also be included on the microchip. For this purpose, sequences are selected based upon the absence of any homology with any known miRNAs.

The microchip may be fabricated by techniques known in the art. For example, probe oligonucleotides of an appropriate length, e.g., 40 nucleotides, are 5′-amine modified at position C6 and printed using commercially available microarray systems, e.g., the GeneMachine OmniGrid™ 100 Microarrayer and Amersham CodeLink™ activated slides. Labeled cDNA oligomer corresponding to the target RNAs is prepared by reverse transcribing the target RNA with labeled primer. Following first strand synthesis, the RNA/DNA hybrids are denatured to degrade the RNA templates. The labeled target cDNAs thus prepared are then hybridized to the microarray chip under hybridizing conditions, e.g., 6×SSPE/30% formamide at 25° C. for 18 hours, followed by washing in 0.75×TNT at 37° C. for 40 minutes. At positions on the array where the immobilized probe DNA recognizes a complementary target cDNA in the sample, hybridization occurs. The labeled target cDNA marks the exact position on the array where binding occurs, allowing automatic detection and quantification. The output consists of a list of hybridization events, indicating the relative abundance of specific cDNA sequences, and therefore the relative abundance of the corresponding complementary miRs, in the patient sample. According to one embodiment, the labeled cDNA oligomer is a biotin-labeled cDNA, prepared from a biotin-labeled primer. The microarray is then processed by direct detection of the biotin-containing transcripts using, e.g., Streptavidin-Alexa647 conjugate, and scanned utilizing conventional scanning methods. Image intensities of each spot on the array are proportional to the abundance of the corresponding miR in the patient sample.

The use of the array has several advantages for miRNA expression detection. First, the global expression of several hundred genes can be identified in the same sample at one time point. Second, through careful design of the oligonucleotide probes, expression of both mature and precursor molecules can be identified. Third, in comparison with Northern blot analysis, the chip requires a small amount of RNA, and provides reproducible results using 2.5 μg of total RNA. The relatively limited number of miRNAs (a few hundred per species) allows the construction of a common microarray for several species, with distinct oligonucleotide probes for each. Such a tool would allow for analysis of trans-species expression for each known miR under various conditions.

In addition to use for quantitative expression level assays of specific miRs, a microchip containing miRNA-specific probe oligonucleotides corresponding to a substantial portion of the miRNome, preferably the entire miRNome, may be employed to carry out miR gene expression profiling, for analysis of miR expression patterns. Distinct miR signatures may be associated with established disease markers, or directly with a disease state. As described hereinafter in Example 11, two distinct clusters of human B-cell chronic lymphocytic leukemia (CLL) samples are associated with the presence or the absence of ZAP-70 expression, a predictor of early disease progression. As described in Examples 11 and 12, two miRNA signatures were associated with the presence or absence of prognostic markers of disease progression, including ZAP-70 expression, mutations in the expressed immunoglobulin variable-region gene IgV_H and deletions at 13q14. Therefore, miR gene expression profiles can be used for diagnosing the disease state of a cancer, such as whether a cancer is malignant or benign, based on whether or not a given profile is representative of a cancer that is associated with one or more established adverse prognostic markers. Prognostic markers that are suitable for this method include ZAP-70 expression, unmutated IgV_H gene, CD38 expression, deletion at chromosome 11q23, loss or mutation of TP53, and any combination thereof.

According to the expression profiling method in one embodiment, total RNA from a sample from a subject suspected of having a cancer is quantitatively reverse transcribed to provide a set of labeled target oligodeoxynucleotides complementary to the RNA in the sample. The target oligodeoxynucleotides are then hybridized to a microarray comprising miRNA-specific probe oligonucleotides to provide a hybridization profile for the sample, which represents the expression pattern of miRNA in the sample. The hybridization profile comprises the signal from the binding of the target oligodeoxynucleotides from the sample to the miRNA-specific probe oligonucleotides in the microarray. The profile may be recorded as the presence or absence of binding (signal vs. zero signal). More preferably, the profile recorded includes the intensity of the signal from each hybridization. The profile is compared to the hybridization profile generated from a normal, i.e., noncancerous, control sample. An alteration in the signal is indicative of the presence of the cancer in the subject.
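
The comparison of a sample hybridization profile against a control profile can be sketched as follows. This is a minimal illustration only: the probe names and intensities are hypothetical, profiles are represented as plain dictionaries of probe name to signal intensity, and the two-fold cutoff is an assumption introduced for the sketch, since the text above does not specify how large an alteration must be.

```python
import math
from typing import Dict


def compare_profiles(sample: Dict[str, float],
                     control: Dict[str, float],
                     min_fold: float = 2.0) -> Dict[str, str]:
    """Return the probes whose signal differs between sample and control.

    'gained' / 'lost' report presence-versus-absence changes (signal vs.
    zero signal); 'up' / 'down' report intensity changes of at least
    min_fold (an assumed threshold, not taken from the text).
    """
    altered = {}
    for probe, control_signal in control.items():
        sample_signal = sample.get(probe, 0.0)
        if sample_signal == 0 and control_signal == 0:
            continue
        if control_signal == 0:
            altered[probe] = "gained"
        elif sample_signal == 0:
            altered[probe] = "lost"
        else:
            log2_ratio = math.log2(sample_signal / control_signal)
            if abs(log2_ratio) >= math.log2(min_fold):
                altered[probe] = "up" if log2_ratio > 0 else "down"
    return altered


# Hypothetical probe names and intensities, for illustration only.
control_profile = {"probe_A": 100.0, "probe_B": 100.0,
                   "probe_C": 100.0, "probe_D": 0.0}
sample_profile = {"probe_A": 400.0, "probe_B": 110.0,
                  "probe_C": 25.0, "probe_D": 50.0}
print(compare_profiles(sample_profile, control_profile))
```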

Other techniques for measuring miR gene expression are also within the skill in the art, and include various techniques for measuring rates of RNA transcription and degradation.

The status of an miR gene in a cell of a subject can also be evaluated by analyzing at least one miR gene or gene product in the sample for a deletion, mutation or amplification, wherein detection of a deletion, mutation or amplification in the miR gene or gene product relative to the miR gene or gene product in a control sample is indicative of the presence of the cancer in the subject. As used herein, a mutation is any alteration in the sequence of a gene of interest that results from one or more nucleotide changes. Such changes include, but are not limited to, allelic polymorphisms, and may affect gene expression and/or function of the gene product.

A deletion, mutation or amplification in an miR gene or gene product can be detected by determining the structure or sequence of an miR gene or gene product in cells from a biological sample from a subject suspected of having cancer associated with a cancer-associated chromosomal feature, and comparing this with the structure or sequence of a corresponding gene or gene product in cells from a control sample. Subject and control samples can be obtained as described herein. Especially suitable candidate miR genes for this type of analysis include, but are not limited to, miR-16-1, miR-27b, miR-206, miR-29b-2 and miR-187. As described in Examples 13 and 14 herein, specific mutations in these five miR genes have been identified in samples from CLL patients.

In certain embodiments, the present invention provides methods for diagnosing whether a subject has, or is at risk for developing, a cancer, comprising analyzing a miR gene or gene product in a test sample from the subject, wherein the detection of a mutation in the miR gene or gene product in the test sample, relative to a control sample, is indicative of the subject having, or being at risk for developing, cancer. In one embodiment, the method comprises analyzing the status of a miR-16-1 gene or gene product. In a particular embodiment, the method comprises analyzing the status of a miR-16-1 gene for the presence of a mutation, wherein the mutation is a C to T nucleotide substitution at +7 base pairs 3′ of the miR-16-1 precursor coding region (see, e.g., SEQ ID NOS:641 and 642). Suitable cancers to be diagnosed by this method include CLL, among others. In another embodiment, the method comprises analyzing the status of a miR-27b gene or gene product. In a particular embodiment, the method comprises analyzing the status of a miR-27b gene for the presence of a mutation, wherein the mutation is a G to A nucleotide substitution at +50 base pairs 3′ of the miR-27b precursor coding region (see, e.g., SEQ ID NOS:645 and 646). Suitable cancers to be diagnosed by this method include, but are not limited to, CLL, throat cancer, and lung cancer. In an additional embodiment, the method comprises analyzing the status of a miR-206 gene or gene product. In a particular embodiment, the method comprises analyzing the status of a miR-206 gene for the presence of a mutation, wherein the mutation is a G to T nucleotide substitution at position 49 of the miR-206 precursor coding region (see, e.g., SEQ ID NOS:657 and 658). In a related embodiment, the method comprises analyzing the status of a miR-206 gene for the presence of a mutation, wherein the mutation is an A to T substitution at −116 base pairs 5′ of the miR-206 precursor coding region (see, e.g., SEQ ID NOS:657 and 659). Suitable cancers to be diagnosed by this method include, but are not limited to, CLL and other leukemias, esophageal cancer, prostate cancer and breast cancer. In yet another embodiment, the method comprises analyzing the status of a miR-29b-2 gene or gene product. In a particular embodiment, the method comprises analyzing the status of a miR-29b-2 gene for the presence of a mutation, wherein the mutation is a G to A nucleotide substitution at +212 base pairs 3′ of the miR-29b-2 precursor coding region (see, e.g., SEQ ID NOS:651 and 652). In a related embodiment, the method comprises analyzing the status of a miR-29b-2 gene for the presence of a mutation, wherein the mutation is an A nucleotide insertion at +107 base pairs 3′ of the miR-29b-2 precursor coding region (see, e.g., SEQ ID NOS:651 and 653). Suitable cancers to be diagnosed by this method include, but are not limited to, CLL and other leukemias, as well as breast cancer. In a further embodiment, the method comprises analyzing the status of a miR-187 gene or gene product. In a particular embodiment, the method comprises analyzing the status of a miR-187 gene for the presence of a mutation, wherein the mutation is a T to C nucleotide substitution at +73 base pairs 3′ of the miR-187 precursor coding region (see, e.g., SEQ ID NOS:654 and 655). Suitable cancers to be diagnosed by this method include CLL, among others.

Any technique suitable for detecting alterations in the structure or sequence of genes can be used in the practice of the present method. For example, the presence of miR gene deletions, mutations or amplifications can be detected by Southern blot hybridization of the genomic DNA from a subject, using nucleic acid probes specific for miR gene sequences.

Southern blot hybridization techniques are within the skill in the art. For example, genomic DNA isolated from a subject's sample can be digested with restriction endonucleases. This digestion generates restriction fragments of the genomic DNA that can be separated by electrophoresis, for example, on an agarose gel. The restriction fragments are then blotted onto a hybridization membrane (e.g., nitrocellulose or nylon), and hybridized with labeled probes specific for a given miR gene or genes. A deletion or mutation of these genes is indicated by an alteration of the restriction fragment patterns on the hybridization membrane, as compared to DNA from a control sample that has been treated identically to the DNA from the subject's sample. Probe labeling and hybridization conditions suitable for detecting alterations in gene structure or sequence can be readily determined by one of ordinary skill in the art. The miR gene nucleic acid probes for Southern blot hybridization can be designed based upon the nucleic acid sequences provided in Table 1, as described herein. Nucleic acid probe hybridization can then be detected by exposing hybridized filters to photographic film, or by employing computerized imaging systems, such as the Molecular Dynamics 400-B 2D Phosphorimager available from Amersham Biosciences, Piscataway, N.J.

Deletions, mutations and/or amplifications of an miR gene can also be detected by amplifying a fragment of these genes by polymerase chain reaction (PCR), and analyzing the amplified fragment by sequencing or by electrophoresis to determine if the sequence and/or length of the amplified fragment from the subject's DNA sample is different from that of a control DNA sample. Suitable reaction and cycling conditions for PCR amplification of DNA fragments can be readily determined by one of ordinary skill in the art.

Deletions of an miR gene can also be identified by detecting deletions of chromosomal markers that are closely linked to the miR gene. Mutations in an miR gene can also be detected by the technique of single strand conformational polymorphism (SSCP), for example, as described in Orita et al. (1989), Genomics 5:874-879 and Hayashi (1991), PCR Methods and Applic. 1:34-38, the entire disclosures of which are herein incorporated by reference. The SSCP technique consists of amplifying a fragment of the gene of interest by PCR, denaturing the fragment, and electrophoresing the two denatured single strands under non-denaturing conditions. The single strands assume a complex sequence-dependent intrastrand secondary structure that affects the strands' electrophoretic mobility.

The status of an miR gene in cells of a subject can also be evaluated by measuring the copy number of the at least one miR gene in the sample, wherein a gene copy number other than two for miR genes on somatic chromosomes and sex chromosomes in a female, or other than one for miR genes on sex chromosomes in a male, is indicative of the presence of the cancer in the subject.
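The copy-number rule stated above can be sketched as a small check. This is an illustrative sketch only; the function names, the chromosome labels and the "male"/"female" strings are assumptions introduced here and do not come from the specification.

```python
def expected_copy_number(chromosome: str, sex: str) -> int:
    """Expected copy number under the rule in the text.

    Two copies are expected for genes on somatic chromosomes and for
    sex-chromosome genes in a female; one copy is expected for
    sex-chromosome genes in a male.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    on_sex_chromosome = chromosome.upper() in {"X", "Y"}
    if on_sex_chromosome and sex == "male":
        return 1
    return 2


def copy_number_is_abnormal(measured: int, chromosome: str, sex: str) -> bool:
    """True when the measured copy number departs from the expected value."""
    return measured != expected_copy_number(chromosome, sex)
```

For example, a measured copy number of one for a gene on chromosome 13 would be flagged, whereas one copy of an X-linked gene in a male would not.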

Any technique suitable for detecting gene copy number can be used in the practice of the present method, including the Southern blot and PCR amplification techniques described above. An alternative method of determining the miR gene copy number in a sample of tissue relies on the fact that many miR genes or gene clusters are closely linked to chromosomal markers or other genes. The loss of a copy of an miR gene in an individual who is heterozygous at a marker or gene closely linked to the miR gene can be inferred from the loss of heterozygosity in the closely linked marker or gene. Methods for determining loss of heterozygosity of chromosomal markers are within the skill in the art.
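The inference from a linked marker can be sketched as follows. This is a simplified illustration that assumes allele calls are already available for a control sample and a test sample from the same individual; the function name and the allele labels are hypothetical.

```python
from typing import Iterable, Optional


def infer_loss_of_heterozygosity(
    control_alleles: Iterable[str], test_alleles: Iterable[str]
) -> Optional[bool]:
    """Infer loss of heterozygosity at a marker linked to an miR gene.

    Returns None when the control genotype is homozygous, because such a
    marker is uninformative. Otherwise returns True when the test sample
    retains only one of the control alleles, and False when it does not.
    """
    control = set(control_alleles)
    test = set(test_alleles)
    if len(control) < 2:
        return None  # homozygous marker: nothing can be inferred
    return len(test) == 1 and test <= control
```

A True result at a closely linked marker would suggest, under the reasoning above, that one copy of the neighboring miR gene may also have been lost.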

As discussed above, the human miR genes are closely associated with different classes of chromosomal features that are themselves associated with cancer. These cancers are likely caused, in part, by the perturbation in the chromosome or genomic DNA caused by the cancer-associated chromosomal feature, which can affect expression of oncogenes or tumor-suppressor genes located near the site of perturbation. Without wishing to be bound by any theory, it is believed that the perturbations caused by the cancer-associated chromosomal features also affect the expression level of miR genes associated with the feature, and that this may also contribute to cancerigenesis. Therefore, a given cancer can be treated by restoring the level of miR gene expression associated with that cancer to normal. For example, if the level of miR gene expression is down-regulated in cancer cells of a subject, then the cancer can be treated by raising the miR expression level. Likewise, if the level of miR gene expression is up-regulated in cancer cells of a subject, then the cancer can be treated by reducing the miR expression level.

The cancers associated with different cancer-associated chromosomal features, and the miR genes associated with these features, are described above and in Tables 5, 6 and 7 and FIG. 2. In the practice of the present method, expression of the appropriate miR gene or genes associated with a particular cancer and/or cancer-associated chromosomal features is altered by the compositions and methods described herein. As before, specific groupings of miR gene(s) that are associated with a particular cancer-associated chromosomal feature and/or cancer are evident from Tables 5, 6 and 7 and in FIG. 2, and are preferred. In one embodiment, the method of treatment comprises administering an miR gene product. In another embodiment, the method of treatment comprises administering an miR gene product, provided the miR gene product is not miR-15, miR-16, miR-143 and/or miR-145.

In one embodiment of the present method, the level of at least one miR gene product in cancer cells of a subject is first determined relative to control cells. Techniques suitable for determining the relative level of miR gene product in cells are described above. If miR gene expression is down-regulated in the cancer cell relative to control cells, then the cancer cells are treated with an effective amount of a compound comprising the isolated miR gene product from the miR gene which is down-regulated. If miR gene expression is up-regulated in cancer cells relative to control cells, then the cancer cells are treated with an effective amount of a compound that inhibits miR gene expression. In one embodiment, the level of miR gene product in a cancer cell is not determined beforehand, for example, in those cancers where miR gene expression is known to be up- or down-regulated.
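The selection logic of this embodiment can be summarized in a short sketch. The names below and the optional `tolerance` parameter are assumptions made for illustration; the specification does not define a numerical threshold for calling expression up- or down-regulated.

```python
from enum import Enum


class Treatment(Enum):
    MIR_GENE_PRODUCT = "administer isolated miR gene product"
    EXPRESSION_INHIBITOR = "administer compound that inhibits miR gene expression"
    NONE = "no expression-based intervention indicated"


def select_treatment(
    cancer_level: float, control_level: float, tolerance: float = 0.0
) -> Treatment:
    """Choose a treatment class from relative miR gene product levels.

    `tolerance` is a hypothetical fractional band around the control level
    within which expression is treated as unchanged.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    ratio = cancer_level / control_level
    if ratio < 1 - tolerance:
        return Treatment.MIR_GENE_PRODUCT  # down-regulated: raise the level
    if ratio > 1 + tolerance:
        return Treatment.EXPRESSION_INHIBITOR  # up-regulated: reduce the level
    return Treatment.NONE
```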

Thus, in the practice of the present treatment methods, an effective amount of at least one isolated miR gene product can be administered to a subject. As used herein, an “effective amount” of an isolated miR gene product is an amount sufficient to inhibit proliferation of a cancer cell in a subject suffering from a cancer associated with a cancer-associated chromosomal feature. One skilled in the art can readily determine an effective amount of an miR gene product to be administered to a given subject, by taking into account factors such as the size and weight of the subject; the extent of disease penetration; the age, health and sex of the subject; the route of administration; and whether the administration is regional or systemic.

For example, an effective amount of isolated miR gene product can be based on the approximate weight of a tumor mass to be treated. The approximate weight of a tumor mass can be determined by calculating the approximate volume of the mass, wherein one cubic centimeter of volume is roughly equivalent to one gram. An effective amount of the isolated miR gene product based on the weight of a tumor mass can be at least about 10 micrograms/gram of tumor mass, and is preferably between about 10-500 micrograms/gram of tumor mass. More preferably, the effective amount is at least about 60 micrograms/gram of tumor mass. Particularly preferably, the effective amount is at least about 100 micrograms/gram of tumor mass. It is preferred that an effective amount based on the weight of the tumor mass be injected directly into the tumor.
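The arithmetic described above can be illustrated as follows. This sketch only restates the stated approximation (one cubic centimeter is roughly one gram) and the stated minimum of about 10 micrograms/gram; the function names are hypothetical and the result is not dosing guidance.

```python
GRAMS_PER_CUBIC_CENTIMETER = 1.0  # approximation stated in the text
MINIMUM_MICROGRAMS_PER_GRAM = 10.0  # "at least about 10 micrograms/gram"


def tumor_mass_grams(volume_cm3: float) -> float:
    """Approximate tumor mass from its approximate volume."""
    if volume_cm3 <= 0:
        raise ValueError("volume_cm3 must be positive")
    return volume_cm3 * GRAMS_PER_CUBIC_CENTIMETER


def amount_micrograms(volume_cm3: float, micrograms_per_gram: float) -> float:
    """Total amount for a tumor of the given volume at the given ratio."""
    if micrograms_per_gram < MINIMUM_MICROGRAMS_PER_GRAM:
        raise ValueError("ratio is below the minimum stated in the text")
    return tumor_mass_grams(volume_cm3) * micrograms_per_gram
```

Under these assumptions, a mass of about 2.5 cubic centimeters at 100 micrograms/gram corresponds to about 250 micrograms.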

An effective amount of an isolated miR gene product can also be based on the approximate or estimated body weight of a subject to be treated. Preferably, such effective amounts are administered parenterally or enterally, as described herein. For example, an effective amount of the isolated miR gene product administered to a subject can range from about 5-3000 micrograms/kg of body weight, and is preferably between about 700-1000 micrograms/kg of body weight, and is more preferably greater than about 1000 micrograms/kg of body weight.

One skilled in the art can also readily determine an appropriate dosage regimen for the administration of an isolated miR gene product to a given subject. For example, an miR gene product can be administered to the subject once (e.g., as a single injection or deposition). Alternatively, an miR gene product can be administered once or twice daily to a subject for a period of from about three to about twenty-eight days, more preferably from about seven to about ten days. In a preferred dosage regimen, an miR gene product is administered once a day for seven days. Where a dosage regimen comprises multiple administrations, it is understood that the effective amount of the miR gene product administered to the subject can comprise the total amount of gene product administered over the entire dosage regimen.
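Where the effective amount is understood as the total over the regimen, the amount per administration follows by simple division. The sketch below assumes equal administrations, which the text does not require; the function name is hypothetical.

```python
def per_administration_amount(
    total_micrograms: float, doses_per_day: int, days: int
) -> float:
    """Split a total effective amount evenly across a dosage regimen.

    Assumes once- or twice-daily administration, as described in the text,
    and equal amounts at each administration (an assumption of this sketch).
    """
    if doses_per_day not in (1, 2):
        raise ValueError("doses_per_day must be 1 or 2")
    if days < 1:
        raise ValueError("days must be at least 1")
    return total_micrograms / (doses_per_day * days)
```

For the once-a-day, seven-day regimen mentioned above, a total of 700 micrograms would correspond to 100 micrograms per administration.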

As used herein, an “isolated” miR gene product is one which is synthesized, or altered or removed from the natural state through human intervention. For example, an miR gene product naturally present in a living animal is not “isolated.” A synthetic miR gene product, or an miR gene product partially or completely separated from the coexisting materials of its natural state, is “isolated.” An isolated miR gene product can exist in substantially purified form, or can exist in a cell into which the miR gene product has been delivered. Thus, an miR gene product which is deliberately delivered to, or expressed in, a cell is considered an “isolated” miR gene product. An miR gene product produced inside a cell from an miR precursor molecule is also considered to be an “isolated” molecule.

Isolated miR gene products can be obtained using a number of standard techniques. For example, the miR gene products can be chemically synthesized or recombinantly produced using methods known in the art. Preferably, miR gene products are chemically synthesized using appropriately protected ribonucleoside phosphoramidites and a conventional DNA/RNA synthesizer. Commercial suppliers of synthetic RNA molecules or synthesis reagents include, e.g., Proligo (Hamburg, Germany), Dharmacon Research (Lafayette, Colo., USA), Pierce Chemical (part of Perbio Science, Rockford, Ill., USA), Glen Research (Sterling, Va., USA), ChemGenes (Ashland, Mass., USA) and Cruachem (Glasgow, UK).

Alternatively, the miR gene products can be expressed from recombinant circular or linear DNA plasmids using any suitable promoter. Suitable promoters for expressing RNA from a plasmid include, e.g., the U6 or H1 RNA pol III promoter sequences, or the cytomegalovirus promoters. Selection of other suitable promoters is within the skill in the art. The recombinant plasmids of the invention can also comprise inducible or regulatable promoters for expression of the miR gene products in cancer cells.

The miR gene products that are expressed from recombinant plasmids can be isolated from cultured cell expression systems by standard techniques. The miR gene products which are expressed from recombinant plasmids can also be delivered to, and expressed directly in, the cancer cells. The use of recombinant plasmids to deliver the miR gene products to cancer cells is discussed in more detail below.

The miR gene products can be expressed from a separate recombinant plasmid, or can be expressed from the same recombinant plasmid. Preferably, the miR gene products are expressed as the RNA precursor molecules from a single plasmid, and the precursor molecules are processed into the functional miR gene product by a suitable processing system, including processing systems extant within a cancer cell. Other suitable processing systems include, e.g., the in vitro Drosophila cell lysate system as described in U.S. published application 2002/0086356 to Tuschl et al. and the E. coli RNAse III system described in U.S. published patent application 2004/0014113 to Yang et al., the entire disclosures of which are herein incorporated by reference.

Selection of plasmids suitable for expressing the miR gene products, methods for inserting nucleic acid sequences into the plasmid to express the gene products, and methods of delivering the recombinant plasmid to the cells of interest are within the skill in the art. See, for example, Zeng et al. (2002), Molecular Cell 9:1327-1333; Tuschl (2002), Nat. Biotechnol. 20:446-448; Brummelkamp et al. (2002), Science 296:550-553; Miyagishi et al. (2002), Nat. Biotechnol. 20:497-500; Paddison et al. (2002), Genes Dev. 16:948-958; Lee et al. (2002), Nat. Biotechnol. 20:500-505; and Paul et al. (2002), Nat. Biotechnol. 20:505-508, the entire disclosures of which are herein incorporated by reference.

In one embodiment, a plasmid expressing the miR gene products comprises a sequence encoding a miR precursor RNA under the control of the CMV immediate-early promoter. As used herein, “under the control” of a promoter means that the nucleic acid sequences encoding the miR gene product are located 3′ of the promoter, so that the promoter can initiate transcription of the miR gene product coding sequences.

The miR gene products can also be expressed from recombinant viral vectors. It is contemplated that the miR gene products can be expressed from two separate recombinant viral vectors, or from the same viral vector. The RNA expressed from the recombinant viral vectors can either be isolated from cultured cell expression systems by standard techniques, or can be expressed directly in cancer cells. The use of recombinant viral vectors to deliver the miR gene products to cancer cells is discussed in more detail below.

The recombinant viral vectors of the invention comprise sequences encoding the miR gene products and any suitable promoter for expressing the RNA sequences. Suitable promoters include, for example, the U6 or H1 RNA pol III promoter sequences, or the cytomegalovirus promoters. Selection of other suitable promoters is within the skill in the art. The recombinant viral vectors of the invention can also comprise inducible or regulatable promoters for expression of the miR gene products in a cancer cell.

Any viral vector capable of accepting the coding sequences for the miR gene products can be used; for example, vectors derived from adenovirus (AV); adeno-associated virus (AAV); retroviruses (e.g., lentiviruses (LV), Rhabdoviruses, murine leukemia virus); herpes virus, and the like. The tropism of the viral vectors can be modified by pseudotyping the vectors with envelope proteins or other surface antigens from other viruses, or by substituting different viral capsid proteins, as appropriate.

For example, lentiviral vectors of the invention can be pseudotyped with surface proteins from vesicular stomatitis virus (VSV), rabies, Ebola, Mokola, and the like. AAV vectors of the invention can be made to target different cells by engineering the vectors to express different capsid protein serotypes. For example, an AAV vector expressing a serotype 2 capsid on a serotype 2 genome is called AAV 2/2. This serotype 2 capsid gene in the AAV 2/2 vector can be replaced by a serotype 5 capsid gene to produce an AAV 2/5 vector. Techniques for constructing AAV vectors which express different capsid protein serotypes are within the skill in the art; see, e.g., Rabinowitz J. E. et al. (2002), J Virol 76:791-801, the entire disclosure of which is herein incorporated by reference.

Selection of recombinant viral vectors suitable for use in the invention, methods for inserting nucleic acid sequences for expressing RNA into the vector, methods of delivering the viral vector to the cells of interest, and recovery of the expressed RNA products are within the skill in the art. See, for example, Dornburg (1995), Gene Therap. 2:301-310; Eglitis (1988), Biotechniques 6:608-614; Miller (1990), Hum. Gene Therap. 1:5-14; and Anderson (1998), Nature 392:25-30, the entire disclosures of which are herein incorporated by reference.

Preferred viral vectors are those derived from AV and AAV. A suitable AV vector for expressing the miR gene products, a method for constructing the recombinant AV vector, and a method for delivering the vector into target cells, are described in Xia et al. (2002), Nat. Biotech. 20:1006-1010, the entire disclosure of which is herein incorporated by reference. Suitable AAV vectors for expressing the miR gene products, methods for constructing the recombinant AAV vector, and methods for delivering the vectors into target cells are described in Samulski et al. (1987), J. Virol. 61:3096-3101; Fisher et al. (1996), J. Virol., 70:520-532; Samulski et al. (1989), J. Virol. 63:3822-3826; U.S. Pat. No. 5,252,479; U.S. Pat. No. 5,139,941; International Patent Application No. WO 94/13788; and International Patent Application No. WO 93/24641, the entire disclosures of which are herein incorporated by reference. Preferably, the miR gene products are expressed from a single recombinant AAV vector comprising the CMV immediate-early promoter.

In one embodiment, a recombinant AAV viral vector of the invention comprises a nucleic acid sequence encoding an miR precursor RNA in operable connection with a polyT termination sequence under the control of a human U6 RNA promoter. As used herein, “in operable connection with a polyT termination sequence” means that the nucleic acid sequences encoding the sense or antisense strands are immediately adjacent to the polyT termination signal in the 5′ direction. During transcription of the miR sequences from the vector, the polyT termination signals act to terminate transcription.

In the practice of the present treatment methods, an effective amount of at least one compound which inhibits miR gene expression can also be administered to the subject. As used herein, “inhibiting miR gene expression” means that the production of miR gene product from the miR gene in the cancer cell after treatment is less than the amount produced prior to treatment. One skilled in the art can readily determine whether miR gene expression has been inhibited in a cancer cell, using for example the techniques for determining miR transcript level discussed above for the diagnostic method.

As used herein, an “effective amount” of a compound that inhibits miR gene expression is an amount sufficient to inhibit proliferation of a cancer cell in a subject suffering from a cancer associated with a cancer-associated chromosomal feature. One skilled in the art can readily determine an effective amount of an miR gene expression-inhibiting compound to be administered to a given subject, by taking into account factors such as the size and weight of the subject; the extent of disease penetration; the age, health and sex of the subject; the route of administration; and whether the administration is regional or systemic.

For example, an effective amount of the expression-inhibiting compound can be based on the approximate weight of a tumor mass to be treated. The approximate weight of a tumor mass can be determined by calculating the approximate volume of the mass, wherein one cubic centimeter of volume is roughly equivalent to one gram. An effective amount based on the weight of a tumor mass can be at least about 10 micrograms/gram of tumor mass, and is preferably between about 10-500 micrograms/gram of tumor mass. More preferably, the effective amount is at least about 60 micrograms/gram of tumor mass. Particularly preferably, the effective amount is at least about 100 micrograms/gram of tumor mass. It is preferred that an effective amount based on the weight of the tumor mass be injected directly into the tumor.

An effective amount of a compound that inhibits miR gene expression can also be based on the approximate or estimated body weight of a subject to be treated. Preferably, such effective amounts are administered parenterally or enterally, as described herein. For example, an effective amount of the expression-inhibiting compound administered to a subject can range from about 5-3000 micrograms/kg of body weight, and is preferably between about 700-1000 micrograms/kg of body weight, and is more preferably greater than about 1000 micrograms/kg of body weight.

One skilled in the art can also readily determine an appropriate dosage regimen for administering a compound that inhibits miR gene expression to a given subject. For example, an expression-inhibiting compound can be administered to the subject once (e.g., as a single injection or deposition). Alternatively, an expression-inhibiting compound can be administered once or twice daily to a subject for a period of from about three to about twenty-eight days, more preferably from about seven to about ten days. In a preferred dosage regimen, an expression-inhibiting compound is administered once a day for seven days. Where a dosage regimen comprises multiple administrations, it is understood that the effective amount of the expression-inhibiting compound administered to the subject can comprise the total amount of compound administered over the entire dosage regimen.

Suitable compounds for inhibiting miR gene expression include double-stranded RNA (such as short- or small-interfering RNA or “siRNA”), antisense nucleic acids, and enzymatic RNA molecules such as ribozymes. Each of these compounds can be targeted to a given miR gene product and destroy or induce the destruction of the target miR gene product.

For example, expression of a given miR gene can be inhibited by inducing RNA interference of the miR gene with an isolated double-stranded RNA (“dsRNA”) molecule which has at least 90%, for example 95%, 98%, 99% or 100%, sequence homology with at least a portion of the miR gene product. In a preferred embodiment, the dsRNA molecule is a “short or small interfering RNA” or “siRNA.”

siRNA useful in the present methods comprise short double-stranded RNA from about 17 nucleotides to about 29 nucleotides in length, preferably from about 19 to about 25 nucleotides in length. The siRNA comprise a sense RNA strand and a complementary antisense RNA strand annealed together by standard Watson-Crick base-pairing interactions (hereinafter “base-paired”). The sense strand comprises a nucleic acid sequence which is substantially identical to a nucleic acid sequence contained within the target miR gene product.

As used herein, a nucleic acid sequence in an siRNA which is “substantially identical” to a target sequence contained within the target mRNA is a nucleic acid sequence that is identical to the target sequence, or that differs from the target sequence by one or two nucleotides. The sense and antisense strands of the siRNA can comprise two complementary, single-stranded RNA molecules, or can comprise a single molecule in which two complementary portions are base-paired and are covalently linked by a single-stranded “hairpin” area.
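The length range and the “substantially identical” definition above can be expressed as a simple check. The sketch below treats “about 17” and “about 29” as hard bounds and counts only substitutions, both of which are simplifying assumptions; the function names are hypothetical.

```python
MIN_LENGTH = 17  # "about 17 nucleotides", treated here as a hard bound
MAX_LENGTH = 29  # "about 29 nucleotides", treated here as a hard bound
MAX_DIFFERENCES = 2  # "differs ... by one or two nucleotides"


def is_substantially_identical(sense: str, target_window: str) -> bool:
    """True if two equal-length sequences differ at no more than two positions."""
    if len(sense) != len(target_window):
        return False
    differences = sum(a != b for a, b in zip(sense, target_window))
    return differences <= MAX_DIFFERENCES


def sense_strand_matches_target(sense: str, target: str) -> bool:
    """Check length, then scan the target for a substantially identical window."""
    n = len(sense)
    if not MIN_LENGTH <= n <= MAX_LENGTH:
        return False
    return any(
        is_substantially_identical(sense, target[i:i + n])
        for i in range(len(target) - n + 1)
    )
```

Any sequence passed to these functions in testing would be an arbitrary placeholder; actual miR gene product sequences are those provided in Table 1.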

The siRNA can also be altered RNA that differs from naturally-occurring RNA by the addition, deletion, substitution and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of the siRNA or to one or more internal nucleotides of the siRNA, or modifications that make the siRNA resistant to nuclease digestion, or the substitution of one or more nucleotides in the siRNA with deoxyribonucleotides.

One or both strands of the siRNA can also comprise a 3′ overhang. As used herein, a “3′ overhang” refers to at least one unpaired nucleotide extending from the 3′-end of a duplexed RNA strand. Thus, in one embodiment, the siRNA comprises at least one 3′ overhang of from 1 to about 6 nucleotides (which includes ribonucleotides or deoxyribonucleotides) in length, preferably from 1 to about 5 nucleotides in length, more preferably from 1 to about 4 nucleotides in length, and particularly preferably from about 2 to about 4 nucleotides in length. In a preferred embodiment, the 3′ overhang is present on both strands of the siRNA, and is 2 nucleotides in length. For example, each strand of the siRNA can comprise 3′ overhangs of dithymidylic acid (“TT”) or diuridylic acid (“uu”).

The siRNA can be produced chemically or biologically, or can be expressed from a recombinant plasmid or viral vector, as described above for the isolated miR gene products. Exemplary methods for producing and testing dsRNA or siRNA molecules are described in U.S. published patent application 2002/0173478 to Gewirtz and in U.S. published patent application 2004/0018176 to Reich et al., the entire disclosures of which are herein incorporated by reference.

Expression of a given miR gene can also be inhibited by an antisense nucleic acid. As used herein, an “antisense nucleic acid” refers to a nucleic acid molecule that binds to target RNA by means of RNA-RNA or RNA-DNA or RNA-peptide nucleic acid interactions, which alters the activity of the target RNA. Antisense nucleic acids suitable for use in the present methods are single-stranded nucleic acids (e.g., RNA, DNA, RNA-DNA chimeras, PNA) that generally comprise a nucleic acid sequence complementary to a contiguous nucleic acid sequence in an miR gene product. Preferably, the antisense nucleic acid comprises a nucleic acid sequence that is 50-100% complementary, more preferably 75-100% complementary, and most preferably 95-100% complementary to a contiguous nucleic acid sequence in an miR gene product. Nucleic acid sequences for the miR gene products are provided in Table 1. Without wishing to be bound by any theory, it is believed that the antisense nucleic acids activate RNase H or some other cellular nuclease that digests the miR gene product/antisense nucleic acid duplex.
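The percent complementarity referred to above can be computed, for illustration, as the fraction of positions at which an antisense sequence pairs with an equal-length contiguous target sequence. This sketch counts only Watson-Crick pairs (G-U wobble pairs are not counted, which is an assumption), and the function name is hypothetical.

```python
# (antisense base, target base) combinations counted as complementary.
# "T" is allowed in the antisense strand to cover DNA antisense molecules.
WATSON_CRICK_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("T", "A")}


def percent_complementarity(antisense_5to3: str, target_5to3: str) -> float:
    """Percent of positions at which the antisense strand pairs with the target.

    Both sequences are written 5' to 3'. The antisense strand is reversed so
    that it is aligned antiparallel to the target before pairing is scored.
    """
    if not target_5to3 or len(antisense_5to3) != len(target_5to3):
        raise ValueError("sequences must be non-empty and of equal length")
    aligned_antisense = antisense_5to3.upper()[::-1]
    paired = sum(
        (a, t) in WATSON_CRICK_PAIRS
        for a, t in zip(aligned_antisense, target_5to3.upper())
    )
    return 100.0 * paired / len(target_5to3)
```

A result could then be compared against the 50%, 75% and 95% lower bounds recited above.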

Antisense nucleic acids can also contain modifications to the nucleic acid backbone or to the sugar and base moieties (or their equivalent) to enhance target specificity, nuclease resistance, delivery or other properties related to efficacy of the molecule. Such modifications include cholesterol moieties, duplex intercalators such as acridine or the inclusion of one or more nuclease-resistant groups.

Antisense nucleic acids can be produced chemically or biologically, or can be expressed from a recombinant plasmid or viral vector, as described above for the isolated miR gene products. Exemplary methods for producing and testing are within the skill in the art; see, e.g., Stein and Cheng (1993), Science 261:1004 and U.S. Pat. No. 5,849,902 to Woolf et al., the entire disclosures of which are herein incorporated by reference.

Expression of a given miR gene can also be inhibited by an enzymatic nucleic acid. As used herein, an “enzymatic nucleic acid” refers to a nucleic acid comprising a substrate binding region that has complementarity to a contiguous nucleic acid sequence of an miR gene product, and which is able to specifically cleave the miR gene product. Preferably, the enzymatic nucleic acid substrate binding region is 50-100% complementary, more preferably 75-100% complementary, and most preferably 95-100% complementary to a contiguous nucleic acid sequence in an miR gene product. The enzymatic nucleic acids can also comprise modifications at the base, sugar, and/or phosphate groups. An exemplary enzymatic nucleic acid for use in the present methods is a ribozyme.

The enzymatic nucleic acids can be produced chemically or biologically, or can be expressed from a recombinant plasmid or viral vector, as described above for the isolated miR gene products. Exemplary methods for producing and testing enzymatic nucleic acid molecules are described in Werner and Uhlenbeck (1995), Nucl. Acids Res. 23:2092-96; Hammann et al. (1999), Antisense and Nucleic Acid Drug Dev. 9:25-31; and U.S. Pat. No. 4,987,071 to Cech et al., the entire disclosures of which are herein incorporated by reference.

Administration of at least one miR gene product, or at least one compound for inhibiting miR gene expression, will inhibit the proliferation of cancer cells in a subject who has a cancer associated with a cancer-associated chromosomal feature. As used herein, to “inhibit the proliferation of a cancer cell” means to kill the cell, or permanently or temporarily arrest or slow the growth of the cell. Inhibition of cancer cell proliferation can be inferred if the number of such cells in the subject remains constant or decreases after administration of the miR gene products or miR gene expression-inhibiting compounds. An inhibition of cancer cell proliferation can also be inferred if the absolute number of such cells increases, but the rate of tumor growth decreases.
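The two inference criteria above can be written as a single predicate. This sketch assumes that cell counts and growth rates have already been estimated by one of the techniques described herein; the function and parameter names are hypothetical.

```python
from typing import Optional


def proliferation_inhibited(
    count_before: float,
    count_after: float,
    growth_rate_before: Optional[float] = None,
    growth_rate_after: Optional[float] = None,
) -> bool:
    """Apply the two criteria stated in the text.

    Inhibition is inferred if the cell number stays constant or decreases,
    or if the number increases but the tumor growth rate has decreased.
    """
    if count_after <= count_before:
        return True
    if growth_rate_before is not None and growth_rate_after is not None:
        return growth_rate_after < growth_rate_before
    return False  # count increased and no growth-rate data were supplied
```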

The number of cancer cells in a subject's body can be determined by direct measurement, or by estimation from the size of primary or metastatic tumor masses. For example, the number of cancer cells in a subject can be measured by immunohistological methods, flow cytometry, or other techniques designed to detect characteristic surface markers of cancer cells.

The size of a tumor mass can be ascertained by direct visual observation, or by diagnostic imaging methods, such as X-ray, magnetic resonance imaging, ultrasound, and scintigraphy. Diagnostic imaging methods used to ascertain size of the tumor mass can be employed with or without contrast agents, as is known in the art. The size of a tumor mass can also be ascertained by physical means, such as palpation of the tissue mass or measurement of the tissue mass with a measuring instrument, such as a caliper.

The miR gene products or miR gene expression-inhibiting compounds can be administered to a subject by any means suitable for delivering these compounds to cancer cells of the subject. For example, the miR gene products or miR expression inhibiting compounds can be administered by methods suitable to transfect cells of the subject with these compounds, or with nucleic acids comprising sequences encoding these compounds. Preferably, the cells are transfected with a plasmid or viral vector comprising sequences encoding at least one miR gene product or miR gene expression inhibiting compound.

Transfection methods for eukaryotic cells are well known in the art, and include, e.g., direct injection of the nucleic acid into the nucleus or pronucleus of a cell; electroporation; liposome transfer or transfer mediated by lipophilic materials; receptor mediated nucleic acid delivery, bioballistic or particle acceleration; calcium phosphate precipitation, and transfection mediated by viral vectors.

For example, cells can be transfected with a liposomal transfer compound, e.g., DOTAP (N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethyl-ammonium methylsulfate, Boehringer-Mannheim) or an equivalent, such as LIPOFECTIN. The amount of nucleic acid used is not critical to the practice of the invention; acceptable results may be achieved with 0.1-100 micrograms of nucleic acid/10⁵ cells. For example, a ratio of about 0.5 micrograms of plasmid vector in 3 micrograms of DOTAP per 10⁵ cells can be used.
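The example ratio above can be scaled to a different number of cells as sketched below. Linear scaling with cell number is an assumption of this sketch and is not stated in the text; the function name is hypothetical.

```python
REFERENCE_CELLS = 1e5  # the ratio in the text is given per 10^5 cells
PLASMID_MICROGRAMS = 0.5  # about 0.5 micrograms of plasmid vector
DOTAP_MICROGRAMS = 3.0  # in 3 micrograms of DOTAP


def scale_transfection_mix(cell_count: float) -> tuple:
    """Return (plasmid micrograms, DOTAP micrograms) for the given cell count.

    Assumes the amounts scale linearly with the number of cells.
    """
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    factor = cell_count / REFERENCE_CELLS
    return (PLASMID_MICROGRAMS * factor, DOTAP_MICROGRAMS * factor)
```

Under this assumption, 10⁶ cells would correspond to about 5 micrograms of plasmid vector in 30 micrograms of DOTAP.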

An miR gene product or miR gene expression inhibiting compound can also be administered to a subject by any suitable enteral or parenteral administration route. Suitable enteral administration routes for the present methods include, e.g., oral, rectal, or intranasal delivery. Suitable parenteral administration routes include, e.g., intravascular administration (e.g., intravenous bolus injection, intravenous infusion, intra-arterial bolus injection, intra-arterial infusion and catheter instillation into the vasculature); peri- and intra-tissue injection (e.g., peri-tumoral and intra-tumoral injection, intra-retinal injection, or subretinal injection); subcutaneous injection or deposition, including subcutaneous infusion (such as by osmotic pumps); direct application to the tissue of interest, for example by a catheter or other placement device (e.g., a retinal pellet or a suppository or an implant comprising a porous, non-porous, or gelatinous material); and inhalation. Preferred administration routes are injection, infusion and direct injection into the tumor.

In the present methods, an miR gene product or miR gene expression-inhibiting compound can be administered to the subject either as naked RNA, in combination with a delivery reagent, or as a nucleic acid (e.g., a recombinant plasmid or viral vector) comprising sequences that express the miR gene product or expression-inhibiting compound. Suitable delivery reagents include, e.g., the Mirus Transit TKO lipophilic reagent; lipofectin; lipofectamine; cellfectin; polycations (e.g., polylysine); and liposomes.

Recombinant plasmids and viral vectors comprising sequences that express the miR gene products or miR gene expression-inhibiting compounds, and techniques for delivering such plasmids and vectors to cancer cells, are discussed above.

In a preferred embodiment, liposomes are used to deliver an miR gene product or miR gene expression-inhibiting compound (or nucleic acids comprising sequences encoding them) to a subject. Liposomes can also increase the blood half-life of the gene products or nucleic acids.

Liposomes suitable for use in the invention can be formed from standard vesicle-forming lipids, which generally include neutral or negatively charged phospholipids and a sterol, such as cholesterol. The selection of lipids is generally guided by consideration of factors such as the desired liposome size and half-life of the liposomes in the bloodstream. A variety of methods are known for preparing liposomes, for example, as described in Szoka et al. (1980), Ann. Rev. Biophys. Bioeng. 9:467; and U.S. Pat. Nos. 4,235,871, 4,501,728, 4,837,028, and 5,019,369, the entire disclosures of which are herein incorporated by reference.

The liposomes for use in the present methods can comprise a ligand molecule that targets the liposome to cancer cells. Ligands which bind to receptors prevalent in cancer cells, such as monoclonal antibodies that bind to tumor cell antigens, are preferred.

The liposomes for use in the present methods can also be modified so as to avoid clearance by the mononuclear macrophage system (“MMS”) and reticuloendothelial system (“RES”). Such modified liposomes have opsonization-inhibition moieties on the surface or incorporated into the liposome structure. In a particularly preferred embodiment, a liposome of the invention can comprise both opsonization-inhibition moieties and a ligand.

Opsonization-inhibiting moieties for use in preparing the liposomes of the invention are typically large hydrophilic polymers that are bound to the liposome membrane. As used herein, an opsonization-inhibiting moiety is “bound” to a liposome membrane when it is chemically or physically attached to the membrane, e.g., by the intercalation of a lipid-soluble anchor into the membrane itself, or by binding directly to active groups of membrane lipids. These opsonization-inhibiting hydrophilic polymers form a protective surface layer that significantly decreases the uptake of the liposomes by the MMS and RES; e.g., as described in U.S. Pat. No. 4,920,016, the entire disclosure of which is herein incorporated by reference.

Opsonization-inhibiting moieties suitable for modifying liposomes are preferably water-soluble polymers with a number-average molecular weight from about 500 to about 40,000 daltons, and more preferably from about 2,000 to about 20,000 daltons. Such polymers include polyethylene glycol (PEG) or polypropylene glycol (PPG) derivatives; e.g., methoxy PEG or PPG, and PEG or PPG stearate; synthetic polymers such as polyacrylamide or poly N-vinyl pyrrolidone; linear, branched, or dendrimeric polyamidoamines; polyacrylic acids; polyalcohols, e.g., polyvinylalcohol and polyxylitol to which carboxylic or amino groups are chemically linked, as well as gangliosides, such as ganglioside GM1. Copolymers of PEG, methoxy PEG, or methoxy PPG, or derivatives thereof, are also suitable. In addition, the opsonization-inhibiting polymer can be a block copolymer of PEG and either a polyamino acid, polysaccharide, polyamidoamine, polyethyleneamine, or polynucleotide. The opsonization-inhibiting polymers can also be natural polysaccharides containing amino acids or carboxylic acids, e.g., galacturonic acid, glucuronic acid, mannuronic acid, hyaluronic acid, pectic acid, neuraminic acid, alginic acid, carrageenan; aminated polysaccharides or oligosaccharides (linear or branched); or carboxylated polysaccharides or oligosaccharides, e.g., reacted with derivatives of carbonic acids with resultant linking of carboxylic groups. Preferably, the opsonization-inhibiting moiety is a PEG, PPG, or derivatives thereof. Liposomes modified with PEG or PEG-derivatives are sometimes called “PEGylated liposomes.”

The opsonization-inhibiting moiety can be bound to the liposome membrane by any one of numerous well-known techniques. For example, an N-hydroxysuccinimide ester of PEG can be bound to a phosphatidyl-ethanolamine lipid-soluble anchor, and then bound to a membrane. Similarly, a dextran polymer can be derivatized with a stearylamine lipid-soluble anchor via reductive amination using Na(CN)BH₃ and a solvent mixture, such as tetrahydrofuran and water in a 30:12 ratio at 60° C.

Liposomes modified with opsonization-inhibition moieties remain in the circulation much longer than unmodified liposomes. For this reason, such liposomes are sometimes called “stealth” liposomes. Stealth liposomes are known to accumulate in tissues fed by porous or “leaky” microvasculature. Thus, tissue characterized by such microvasculature defects, for example solid tumors, will efficiently accumulate these liposomes; see Gabizon, et al. (1988), Proc. Natl. Acad. Sci., USA, 18:6949-53. In addition, the reduced uptake by the RES lowers the toxicity of stealth liposomes by preventing significant accumulation of the liposomes in the liver and spleen. Thus, liposomes that are modified with opsonization-inhibition moieties are particularly suited to deliver the miR gene products or miR gene expression inhibition compounds (or nucleic acids comprising sequences encoding them) to tumor cells.

The miR gene products or miR gene expression inhibition compounds are preferably formulated as pharmaceutical compositions, sometimes called “medicaments,” prior to administering to a subject, according to techniques known in the art. Pharmaceutical compositions of the present invention are characterized as being at least sterile and pyrogen-free. As used herein, “pharmaceutical formulations” include formulations for human and veterinary use. Methods for preparing pharmaceutical compositions of the invention are within the skill in the art, for example as described in Remington's Pharmaceutical Science, 17th ed., Mack Publishing Company, Easton, Pa. (1985), the entire disclosure of which is herein incorporated by reference.

The present pharmaceutical formulations comprise at least one miR gene product or miR gene expression inhibition compound (or at least one nucleic acid comprising sequences encoding them) (e.g., 0.1 to 90% by weight), or a physiologically acceptable salt thereof, mixed with a pharmaceutically-acceptable carrier. The pharmaceutical formulations of the invention can also comprise at least one miR gene product or miR gene expression inhibition compound (or at least one nucleic acid comprising sequences encoding them) which are encapsulated by liposomes and a pharmaceutically-acceptable carrier. In one embodiment, the pharmaceutical compositions comprise an miR gene or gene product that is not miR-15, miR-16, miR-143 and/or miR-145.

Preferred pharmaceutically-acceptable carriers are water, buffered water, normal saline, 0.4% saline, 0.3% glycine, hyaluronic acid and the like.

In a preferred embodiment, the pharmaceutical compositions of the invention comprise at least one miR gene product or miR gene expression inhibition compound (or at least one nucleic acid comprising sequences encoding them) which is resistant to degradation by nucleases. One skilled in the art can readily synthesize nucleic acids which are nuclease resistant, for example by incorporating one or more ribonucleotides that are modified at the 2′-position into the miR gene products. Suitable 2′-modified ribonucleotides include those modified at the 2′-position with fluoro, amino, alkyl, alkoxy, and O-allyl.

Pharmaceutical compositions of the invention can also comprise conventional pharmaceutical excipients and/or additives. Suitable pharmaceutical excipients include stabilizers, antioxidants, osmolality adjusting agents, buffers, and pH adjusting agents. Suitable additives include, e.g., physiologically biocompatible buffers (e.g., tromethamine hydrochloride), additions of chelants (such as, for example, DTPA or DTPA-bisamide) or calcium chelate complexes (such as, for example, calcium DTPA, CaNaDTPA-bisamide), or, optionally, additions of calcium or sodium salts (for example, calcium chloride, calcium ascorbate, calcium gluconate or calcium lactate). Pharmaceutical compositions of the invention can be packaged for use in liquid form, or can be lyophilized.

For solid pharmaceutical compositions of the invention, conventional nontoxic solid pharmaceutically-acceptable carriers can be used; for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium carbonate, and the like.

For example, a solid pharmaceutical composition for oral administration can comprise any of the carriers and excipients listed above and 10-95%, preferably 25%-75%, of the at least one miR gene product or miR gene expression inhibition compound (or at least one nucleic acid comprising sequences encoding them). A pharmaceutical composition for aerosol (inhalational) administration can comprise 0.01-20% by weight, preferably 1%-10% by weight, of the at least one miR gene product or miR gene expression inhibition compound (or at least one nucleic acid comprising sequences encoding them) encapsulated in a liposome as described above, and a propellant. A carrier can also be included as desired; e.g., lecithin for intranasal delivery.
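The percentage ranges given for the oral solid and aerosol compositions can be tabulated and checked with a short sketch such as the one below. The dictionary keys and function names are hypothetical, and treating the oral range as a weight percentage is an assumption, since only the aerosol range is expressly stated "by weight".

```python
# Loading ranges (percent of active agent) taken from the paragraph above.
# Treating the oral range as percent by weight is an assumption.
LOADING_RANGES = {
    "oral_solid": {"allowed": (10.0, 95.0), "preferred": (25.0, 75.0)},
    "aerosol": {"allowed": (0.01, 20.0), "preferred": (1.0, 10.0)},
}


def classify_loading(route, percent):
    """Return 'preferred', 'allowed', or 'outside' for a given loading."""
    ranges = LOADING_RANGES[route]
    low, high = ranges["preferred"]
    if low <= percent <= high:
        return "preferred"
    low, high = ranges["allowed"]
    if low <= percent <= high:
        return "allowed"
    return "outside"


def active_mass_mg(total_mass_mg, percent):
    """Mass of active agent in a composition of the given total mass."""
    return total_mass_mg * percent / 100.0


if __name__ == "__main__":
    print(classify_loading("oral_solid", 50.0))
    print(active_mass_mg(200.0, 25.0))
```

For instance, under these assumptions a 200 mg oral tablet at the lower preferred bound of 25% would carry 50 mg of the active agent.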

The invention will now be illustrated by the following non-limiting examples.

EXAMPLES

The following techniques were used in the Examples.

General Methods:

The miR Gene Database

A set of 187 human miR genes was compiled (see Table 1). The set comprises 153 miRs identified in the miR Registry (maintained by the Wellcome Trust Sanger Institute, Cambridge, UK), and 36 other miRs manually curated from published papers (Lim et al., 2003, Science 299:1540; Lagos-Quintana et al., 2001, Science 294:853-858; Lau et al., 2001, Science 294:858-862; Lee et al., 2001, Science 294:862-864; Mourelatos et al., 2002, Genes Dev. 16:720-728; Lagos-Quintana et al., 2002, Curr. Biol. 12:735-739; Dostie et al., 2003, RNA 9:180-186; Houbaviy et al., 2003, Dev. Cell. 5:351-8) or found in the GenBank database accessed through the National Center for Biotechnology Information (NCBI) website, maintained by the National Institutes of Health and the National Library of Medicine. Nineteen new human miRs (approximately 10% of the miR set) were found based on their homology with cloned miRs from other species (mainly mouse). For all miRs, the sequence of the precursor was identified using the M. Zuker RNA folding program and selecting the precursor sequence that gave the best score for the hairpin structure. The program is available and is maintained by Michael Zuker of Rensselaer Polytechnic Institute.
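The selection step described above, choosing among candidate precursor sequences the one with the best hairpin score, can be illustrated with a deliberately simplified sketch. The scoring function below is a toy stand-in and is not the RNA folding program used in the Examples: it merely counts Watson-Crick and G-U pairs when a sequence is folded back on itself at its midpoint. All names are hypothetical.

```python
# Toy stand-in for a hairpin score; NOT the folding program cited in the text.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def hairpin_score(seq, min_loop=3):
    """Count base pairs when the sequence is folded back at its midpoint.

    Position i from the 5' end is paired with position i from the 3' end,
    leaving at least `min_loop` unpaired bases in the loop.
    """
    rna = seq.upper().replace("T", "U")  # table sequences are written as DNA
    score = 0
    i, j = 0, len(rna) - 1
    while j - i > min_loop:
        if (rna[i], rna[j]) in PAIRS:
            score += 1
        i += 1
        j -= 1
    return score


def best_precursor(candidates):
    """Pick the candidate with the highest toy hairpin score."""
    return max(candidates, key=hairpin_score)


if __name__ == "__main__":
    candidates = ["AAAAAAAAAAAA", "GGGGAAAACCCC"]
    print(best_precursor(candidates))
```

A thermodynamic folding program evaluates bulges, internal loops, and free energies that this fixed-midpoint count ignores, so the sketch conveys only the "score every candidate, keep the best" logic.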

TABLE 1 Human miR Gene Product Sequences SEQ ID Name Precursor Sequence(5′ to 3′)* NO. has-let-7a-1-prec CACTGTGGGATGAGGTAGTAGGTTGTATAGTTTTAGG1 GTCACACCCACCACTGGGAGATAACTATACAATCTAC TGTCTTTCCTAACGTGhsa-let-7a-2-prec AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACAT 2CAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT hsa-let-7a-3-precGGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCC 3CTGCTATGGGATAACTATACAATCTACTGTCTTTCCT hsa-let-7a-4-precGTGACTGCATGCTCCCAGGTTGAGGTAGTAGGTTGTA 4TAGTTTAGAATTACACAAGGGAGATAACTGTACAGCC TCCTAGCTTTCCTTGGGTCTTGCACTAAACAAChsa-let-7b-prec GGCGGGGTGAGGTAGTAGGTTGTGTGGTTTCAGGGCA 5GTGATGTTGCCCCTCGGAAGATAACTATACAACCTAC TGCCTTCCCTG hsa-let-7c-precGCATCCGGGTTGAGGTAGTAGGTTGTATGGTTTAGAG 6TTACACCCTGGGAGTTAACTGTACAACCTTCTAGCTTT CCTTGGAGC hsa-let-7d-precCCTAGGAAGAGGTAGTAGGTTGCATAGTTTTAGGGCA 7GGGATTTTGCCCACAAGGAGGTAACTATACGACCTGC TGCCTTTCTTAGG hsa-let-7d-v1-precCTAGGAAGAGGTAGTAGTTTGCATAGTTTTAGGGCAA 8AGATTTTGCCCACAAGTAGTTAGCTATACGACCTGCA GCCTTTTGTAG hsa-let-7d-v2-precCTGGCTGAGGTAGTAGTTTGTGCTGTTGGTCGGGTTGT 9GACATTGCCCGCTGTGGAGATAACTGCGCAAGCTACT GCCTTGCTAG hsa-let-7e-precCCCGGGCTGAGGTAGGAGGTTGTATAGTTGAGGAGGA 10CACCCAAGGAGATCACTATACGGCCTCCTAGCTTTCCC CAGG hsa-let-7f-1-precTCAGAGTGAGGTAGTAGATTGTATAGTTGTGGGGTAG 11TGATTTTACCCTGTTCAGGAGATAACTATACAATCTAT TGCCTTCCCTGA hsa-let-7f-2-precCTGTGGGATGAGGTAGTAGATTGTATAGTTGTGGGGT 12AGTGATTTTACCCTGTTCAGGAGATAACTATACAATCT ATTGCCTTCCCTGA hsa-let-7f-2-precCTGTGGGATGAGGTAGTAGATTGTATAGTTTTAGGGT 13CATACCCCATCTTGGAGATAACTATACAGTCTACTGTC TTTCCCACGG hsa-let-7g-precTTGCCTGATTCCAGGCTGAGGTAGTAGTTTGTACAGTT 14TGAGGGTCTATGATACCACCCGGTACAGGAGATAACT GTACAGGCCACTGCCTTGCCAGGAACAGCGCGChsa-let-7i-prec CTGGCTGAGGTAGTAGTTTGTGCTGTTGGTCGGGTTGT 15GACATTGCCCGCTGTGGAGATAACTGCGCAAGCTACT GCCTTGCTAG hsa-mir-001b-1-ACCTACTCAGAGTACATACTTCTTTATGTACCCATATG 16 precAACATACAATGCTATGGAATGTAAAGAAGTATGTATT TTTGGTAGGC hsa-mir-001b-1-CAGCTAACAACTTAGTAATACCTACTCAGAGTACATA 17 precCTTCTTTATGTACCCATATGAACATACAATGCTATGGA ATGTAAAGAAGTATGTATTTTTGGTAGGCAATAhsa-mir-001b-2- GCCTGCTTGGGAAACATACTTCTTTATATGCCCATATG 18 
precGACCTGCTAAGCTATGGAATGTAAAGAAGTATGTATC TCAGGCCGGG hsa-mir-001b-TGGGAAACATACTTCTTTATATGCCCATATGGACCTGC 19 precTAAGCTATGGAATGTAAAGAAGTATGTATCTCA hsa-mir-001d-ACCTACTCAGAGTACATACTTCTTTATGTACCCATATG 20 precAACATACAATGCTATGGAATGTAAAGAAGTATGTATT TTTGGTAGGC hsa-mir-007-1TGGATGTTGGCCTAGTTCTGTGTGGAAGACTAGTGATT 21TTGTTGTTTTTAGATAACTAAATCGACAACAAATCACA GTCTGCCATATGGCACAGGCCATGCCTCTACAhsa-mir-007-1- TTGGATGTTGGCCTAGTTCTGTGTGGAAGACTAGTGAT 22 precTTTGTTGTTTTTAGATAACTAAATCGACAACAAATCACAGTCTGCCATATGGCACAGGCCATGCCTCTACAG hsa-mir-007-2CTGGATACAGAGTGGACCGGCTGGCCCCATCTGGAAG 23ACTAGTGATTTTGTTGTTGTCTTACTGCGCTCAACAACAAATCCCAGTCTACCTAATGGTGCCAGCCATCGCA hsa-mir-007-2-CTGGATACAGAGTGGACCGGCTGGCCCCATCTGGAAG 24 precACTAGTGATTTTGTTGTTGTCTTACTGCGCTCAACAACAAATCCCAGTCTACCTAATGGTGCCAGCCATCGCA hsa-mir-007-3AGATTAGAGTGGCTGTGGTCTAGTGCTGTGTGGAAGA 25CTAGTGATTTTGTTGTTCTGATGTACTACGACAACAAGTCACAGCCGGCCTCATAGCGCAGACTCCCTTCGAC hsa-mir-007-3-AGATTAGAGTGGCTGTGGTCTAGTGCTGTGTGGAAGA 26 precCTAGTGATTTTGTTGTTCTGATGTACTACGACAACAAGTCACAGCCGGCCTCATAGCGCAGACTCCCTTCGAC hsa-mir-009-1CGGGGTTGGTTGTTATCTTTGGTTATCTAGCTGTATGA 27GTGGTGTGGAGTCTTCATAAAGCTAGATAACCGAAAG TAAAAATAACCCCA hsa-mir-009-2GGAAGCGAGTTGTTATCTTTGGTTATCTAGCTGTATGA 28GTGTATTGGTCTTCATAAAGCTAGATAACCGAAAGTA AAAACTCCTTCA hsa-mir-009-3GGAGGCCCGTTTCTCTCTTTGGTTATCTAGCTGTATGA 29GTGCCACAGAGCCGTCATAAAGCTAGATAACCGAAAG TAGAAATGATTCTCA hsa-mir-010a-precGATCTGTCTGTCTTCTGTATATACCCTGTAGATCCGAA 30TTTGTGTAAGGAATTTTGTGGTCACAAATTCGTATCTAGGGGAATATGTAGTTGACATAAACACTCCGCTCT hsa-mir-010b-CCAGAGGTTGTAACGTTGTCTATATATACCCTGTAGAA 31 precCCGAATTTGTGTGGTATCCGTATAGTCACAGATTCGATTCTAGGGGAATATATGGTCGATGCAAAAACTTCA hsa-mir-015a-2-GCGCGAATGTGTGTTTAAAAAAAATAAAACCTTGGAG 32 precTAAAGTAGCAGCACATAATGGTTTGTGGATTTTGAAA AGGTGCAGGCCATATTGTGCTGCCTCAAAAATAChsa-mir-015a-prec CCTTGGAGTAAAGTAGCAGCACATAATGGTTTGTGGA 33TTTTGAAAAGGTGCAGGCCATATTGTGCTGCCTCAAA AATACAAGG hsa-mir-015b-CTGTAGCAGCACATCATGGTTTACATGCTACAGTCAA 34 precGATGCGAATCATTATTTGCTGCTCTAG hsa-mir-015b-TTGAGGCCTTAAAGTACTGTAGCAGCACATCATGGTTT 35 
precACATGCTACAGTCAAGATGCGAATCATTATTTGCTGCT CTAGAAATTTAAGGAAATTCAThsa-mir-016a- GTCAGCAGTGCCTTAGCAGCACGTAAATATTGGCGTT 36 chr13AAGATTCTAAAATTATCTCCAGTATTAACTGTGCTGCT GAAGTAAGGTTGAC hsa-mir-016b-GTTCCACTCTAGCAGCACGTAAATATTGGCGTAGTGA 37 chr3AATATATATTAAACACCAATATTACTGTGCTGCTTTAG TGTGAC hsa-mir-016-prec-GCAGTGCCTTAGCAGCACGTAAATATTGGCGTTAAGA 38 13TTCTAAAATTATCTCCAGTATTAACTGTGCTGCTGAAG TAAGGT hsa-mir-017-precGTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGT 39GATATGTGCATCTACTGCAGTGAAGGCACTTGTAGCA TTATGGTGAC hsa-mir-018-precTGTTCTAAGGTGCATCTAGTGCAGATAGTGAAGTAGA 40TTAGCATCTACTGCCCTAAGTGCTCCTTCTGGCA hsa-mir-018-prec-TTTTTGTTCTAAGGTGCATCTAGTGCAGATAGTGAAGT 41 13AGATTAGCATCTACTGCCCTAAGTGCTCCTTCTGGCAT AAGAA hsa-mir-019a-precGCAGTCCTCTGTTAGTTTTGCATAGTTGCACTACAAGA 42AGAATGTAGTTGTGCAAATCTATGCAAAACTGATGGT GGCCTGC hsa-mir-019a-CAGTCCTCTGTTAGTTTTGCATAGTTGCACTACAAGAA 43 prec-13GAATGTAGTTGTGCAAATCTATGCAAAACTGATGGTG GCCTG hsa-mir-019b-1-CACTGTTCTATGGTTAGTTTTGCAGGTTTGCATCCAGC 44 precTGTGTGATATTCTGCTGTGCAAATCCATGCAAAACTGA CTGTGGTAGTG hsa-mir-019b-2-ACATTGCTACTTACAATTAGTTTTGCAGGTTTGCATTT 45 precCAGCGTATATATGTATATGTGGCTGTGCAAATCCATGC AAAACTGATTGTGATAATGThsa-mir-019b- TTCTATGGTTAGTTTTGCAGGTTTGCATCCAGCTGTGT 46 prec-13GATATTCTGCTGTGCAAATCCATGCAAAACTGACTGT GGTAG hsa-mir-019b-TTACAATTAGTTTTGCAGGTTTGCATTTCAGCGTATAT 47 prec-XATGTATATGTGGCTGTGCAAATCCATGCAAAACTGAT TGTGAT hsa-mir-020-precGTAGCACTAAAGTGCTTATAGTGCAGGTAGTGTTTAG 48TTATCTACTGCATTATGAGCACTTAAAGTACTGC hsa-mir-021-precTGTCGGGTAGCTTATCAGACTGATGTTGACTGTTGAAT 49CTCATGGCAACACCAGTCGATGGGCTGTCTGACA hsa-mir-021-prec-ACCTTGTCGGGTAGCTTATCAGACTGATGTTGACTGTT 50 17GAATCTCATGGCAACACCAGTCGATGGGCTGTCTGAC ATTTTG hsa-mir-022-precGGCTGAGCCGCAGTAGTTCTTCAGTGGCAAGCTTTAT 51GTCCTGACCCAGCTAAAGCTGCCAGTTGAAGAACTGT TGCCCTCTGCC hsa-mir-023a-precGGCCGGCTGGGGTTCCTGGGGATGGGATTTGCTTCCT 52GTCACAAATCACATTGCCAGGGATTTCCAACCGACC hsa-mir-023b-CTCAGGTGCTCTGGCTGCTTGGGTTCCTGGCATGCTGA 53 precTTTGTGACTTAAGATTAAAATCACATTGCCAGGGATTA CCACGCAACCACGACCTTGGChsa-mir-023-prec- 
CCACGGCCGGCTGGGGTTCCTGGGGATGGGATTTGCT 54 19TCCTGTCACAAATCACATTGCCAGGGATTTCCAACCG ACCCTGA hsa-mir-024-1-CTCCGGTGCCTACTGAGCTGATATCAGTTCTCATTTTA 55 precCACACTGGCTCAGTTCAGCAGGAACAGGAG hsa-mir-024-2-CTCTGCCTCCCGTGCCTACTGAGCTGAAACACAGTTGG 56 precTTTGTGTACACTGGCTCAGTTCAGCAGGAACAGGG hsa-mir-024-prec-CCCTGGGCTCTGCCTCCCGTGCCTACTGAGCTGAAACA 57 19CAGTTGGTTTGTGTACACTGGCTCAGTTCAGCAGGAA CAGGGG hsa-mir-024-prec-9CCCTCCGGTGCCTACTGAGCTGATATCAGTTCTCATTT 58TACACACTGGCTCAGTTCAGCAGGAACAGCATC hsa-mir-025-precGGCCAGTGTTGAGAGGCGGAGACTTGGGCAATTGCTG 59GACGCTGCCCTGGGCATTGCACTTGTCTCGGTCTGACA GTGCCGGCC hsa-mir-026a-precAGGCCGTGGCCTCGTTCAAGTAATCCAGGATAGGCTG 60TGCAGGTCCCAATGGCCTATCTTGGTTACTTGCACGGG GACGCGGGCCT hsa-mir-026b-CCGGGACCCAGTTCAAGTAATTCAGGATAGGTTGTGT 61 precGCTGTCCAGCCTGTTCTCCATTACTTGGCTCGGGGACC GG hsa-mir-027a-precCTGAGGAGCAGGGCTTAGCTGCTTGTGAGCAGGGTCC 62ACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCCC CCAG hsa-mir-027b-AGGTGCAGAGCTTAGCTGATTGGTGAACAGTGATTGG 63 precTTTCCGCTTTGTTCACAGTGGCTAAGTTCTGCACCT hsa-mir-027b-ACCTCTCTAACAAGGTGCAGAGCTTAGCTGATTGGTG 64 precAACAGTGATTGGTTTCCGCTTTGTTCACAGTGGCTAAG TTCTGCACCTGAAGAGAAGGTGhsa-mir-027-prec- CCTGAGGAGCAGGGCTTAGCTGCTTGTGAGCAGGGTC 65 19CACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCC CCCAGG hsa-mir-028-precGGTCCTTGCCCTCAAGGAGCTCACAGTCTATTGAGTTA 66CCTTTCTGACTTTCCCACTAGATTGTGAGCTCCTGGAG GGCAGGCACT hsa-mir-029a-2CCTTCTGTGACCCCTTAGAGGATGACTGATTTCTTTTG 67GTGTTCAGAGTCAATATAATTTTCTAGCACCATCTGAA ATCGGTTATAATGATTGGGGAAGAGCACCATGhsa-mir-029a-prec ATGACTGATTTCTTTTGGTGTTCAGAGTCAATATAATT 68TTCTAGCACCATCTGAAATCGGTTAT hsa-mir-029c-precACCACTGGCCCATCTCTTACACAGGCTGACCGATTTCT 69CCTGGTGTTCAGAGTCTGTTTTTGTCTAGCACCATTTGAAATCGGTTATGATGTAGGGGGAAAAGCAGCAGC hsa-mir-030a-precGCGACTGTAAACATCCTCGACTGGAAGCTGTGAAGCC 70ACAGATGGGCTTTCAGTCGGATGTTTGCAGCTGC hsa-mir-030b-ATGTAAACATCCTACACTCAGCTGTAATACATGGATT 71 prec GGCTGGGAGGTGGATGTTTACGThsa-mir-030b- ACCAAGTTTCAGTTCATGTAAACATCCTACACTCAGCT 72 precGTAATACATGGATTGGCTGGGAGGTGGATGTTTACTT CAGCTGACTTGGA hsa-mir-030c-precAGATACTGTAAACATCCTACACTCTCAGCTGTGGAAA 
73GTAAGAAAGCTGGGAGAAGGCTGTTTACTCTTTCT hsa-mir-030d-GTTGTTGTAAACATCCCCGACTGGAAGCTGTAAGACA 74 precCAGCTAAGCTTTCAGTCAGATGTTTGCTGCTAC hsa-mir-031-precGGAGAGGAGGCAAGATGCTGGCATAGCTGTTGAACTG 75GGAACCTGCTATGCCAACATATTGCCATCTTTCC hsa-mir-032-precGGAGATATTGCACATTACTAAGTTGCATGTTGTCACG 76GCCTCAATGCAATTTAGTGTGTGTGATATTTTC hsa-mir-033b-GGGGGCCGAGAGAGGCGGGCGGCCCCGCGGTGCATT 77 precGCTGTTGCATTGCACGTGTGTGAGGCGGGTGCAGTGCCTCGGCAGTGCAGCCCGGAGCCGGCCCCTGGCACCAC hsa-mir-033-precCTGTGGTGCATTGTAGTTGCATTGCATGTTCTGGTGGT 78ACCCATGCAATGTTTCCACAGTGCATCACAG hsa-mir-034-precGGCCAGCTGTGAGTGTTTCTTTGGCAGTGTCTTAGCTG 79GTTGTTGTGAGCAATAGTAAGGAAGCAATCAGCAAGTATACTGCCCTAGAAGTGCTGCACGTTGTGGGGCCC hsa-mir-091-prec-TCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTG 80 13ATATGTGCATCTACTGCAGTGAAGGCACTTGTAGCATT ATGGTGA hsa-mir-092-prec-CTTTCTACACAGGTTGGGATCGGTTGCAATGCTGTGTT 81 13 = 092-1TCTGTATGGTATTGCACTTGTCCCGGCCTGTTGAGTTT GG hsa-mir-092-prec-TCATCCCTGGGTGGGGATTTGTTGCATTACTTGTGTTC 82 X = 092-2TATATAAAGTATTGCACTTGTCCCGGCCTGTGGAAGA hsa-mir-093-prec-CTGGGGGCTCCAAAGTGCTGTTCGTGCAGGTAGTGTG 83 7.1 = 093-1ATTACCCAACCTACTGCTGAGCTAGCACTTCCCGAGCC CCCGG hsa-mir-093-prec-CTGGGGGCTCCAAAGTGCTGTTCGTGCAGGTAGTGTG 84 7.2 = 093-2ATTACCCAACCTACTGCTGAGCTAGCACTTCCCGAGCC CCCGG hsa-mir-095-prec-4AACACAGTGGGCACTCAATAAATGTCTGTTGAATTGA 85AATGCGTTACATTCAACGGGTATTTATTGAGCACCCAC TCTGTG hsa-mir-096-prec-7TGGCCGATTTTGGCACTAGCACATTTTTGCTTGTGTCT 86CTCCGCTCTGAGCAATCATGTGCAGTGCCAATATGGG AAA hsa-mir-098-prec-XGTGAGGTAGTAAGTTGTATTGTTGTGGGGTAGGGATA 87TTAGGCCCCAATTAGAAGATAACTATACAACTTACTA CTTTCC hsa-mir-099b-GGCACCCACCCGTAGAACCGACCTTGCGGGGCCTTCG 88 prec-19CCGCACACAAGCTCGTGTCTGTGGGTCCGTGTC hsa-mir-099-prec-CCCATTGGCATAAACCCGTAGATCCGATCTTGTGGTG 89 21AAGTGGACCGCACAAGCTCGCTTCTATGGGTCTGTGT CAGTGTG hsa-mir-100-1/2-AAGAGAGAAGATATTGAGGCCTGTTGCCACAAACCCG 90 precTAGATCCGAACTTGTGGTATTAGTCCGCACAAGCTTGT ATCTATAGGTATGTGTCTGTTAGGCAATCTCAChsa-mir-100-prec- CCTGTTGCCACAAACCCGTAGATCCGAACTTGTGGTAT 91 11TAGTCCGCACAAGCTTGTATCTATAGGTATGTGTCTGT TAGG 
hsa-mir-101-1/2-AGGCTGCCCTGGCTCAGTTATCACAGTGCTGATGCTGT 92 precCTATTCTAAAGGTACAGTACTGTGATAACTGAAGGATGGCAGCCATCTTACCTTCCATCAGAGGAGCCTCAC hsa-mir-101-precTCAGTTATCACAGTGCTGATGCTGTCCATTCTAAAGGT 93 ACAGTACTGTGATAACTGAhsa-mir-101-prec-1 TGCCCTGGCTCAGTTATCACAGTGCTGATGCTGTCTAT 94TCTAAAGGTACAGTACTGTGATAACTGAAGGATGGCA hsa-mir-101-prec-9TGTCCTTTTTCGGTTATCATGGTACCGATGCTGTATAT 95CTGAAAGGTACAGTACTGTGATAACTGAAGAATGGTG hsa-mir-102-prec-1CTTCTGGAAGCTGGTTTCACATGGTGGCTTAGATTTTT 96CCATCTTTGTATCTAGCACCATTTGAAATCAGTGTTTT AGGAG hsa-mir-102-prec-CTTCAGGAAGCTGGTTTCATATGGTGGTTTAGATTTAA 97 7.1ATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTT GGGGG hsa-mir-102-prec-CTTCAGGAAGCTGGTTTCATATGGTGGTTTAGATTTAA 98 7.2ATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTT GGGGG hsa-mir-103-2-TTGTGCTTTCAGCTTCTTTACAGTGCTGCCTTGTAGCA 99 precTTCAGGTCAAGCAACATTGTACAGGGCTATGAAAGAA CCA hsa-mir-103-prec-TTGTGCTTTCAGCTTCTTTACAGTGCTGCCTTGTAGCA 100 20TTCAGGTCAAGCAACATTGTACAGGGCTATGAAAGAA CCA hsa-mir-103-prec-TACTGCCCTCGGCTTCTTTACAGTGCTGCCTTGTTGCA 101 5 = 103-1TATGGATCAAGCAGCATTGTACAGGGCTATGAAGGCA TTG hsa-mir-104-prec-AAATGTCAGACAGCCCATCGACTGGTGTTGCCATGAG 102 17ATTCAACAGTCAACATCAGTCTGATAAGCTACCCGAC AAGG hsa-mir-105-prec-TGTGCATCGTGGTCAAATGCTCAGACTCCTGTGGTGGC 103 X.1 = 105-1TGCTCATGCACCACGGATGTTTGAGCATGTGCTACGGT GTCTA hsa-mir-105-prec-TGTGCATCGTGGTCAAATGCTCAGACTCCTGTGGTGGC 104 X.2 = 105-2TGCTCATGCACCACGGATGTTTGAGCATGTGCTACGGT GTCTA hsa-mir-106-prec-XCCTTGGCCATGTAAAAGTGCTTACAGTGCAGGTAGCT 105TTTTGAGATCTACTGCAATGTAAGCACTTCTTACATTA CCATGG hsa-mir-107-prec-CTCTCTGCTTTCAGCTTCTTTACAGTGTTGCCTTGTGGC 106 10ATGGAGTTCAAGCAGCATTGTACAGGGCTATCAAAGC ACAGA hsa-mir-122a-precCCTTAGCAGAGCTGTGGAGTGTGACAATGGTGTTTGT 107GTCTAAACTATCAAACGCCATTATCACACTAAATAGC TACTGCTAGGC hsa-mir-122a-precAGCTGTGGAGTGTGACAATGGTGTTTGTGTCCAAACT 108 ATCAAACGCCATTATCACACTAAATAGCThsa-mir-123-prec ACATTATTACTTTTGGTACGCGCTGTGACACTTCAAAC 109TCGTACCGTGAGTAATAATGCGC hsa-mir-124a-1-tccttcctCAGGAGAAAGGCCTCTCTCTCCGTGTTCACAGC 110 precGGACCTTGATTTAAATGTCCATACAATTAAGGCACGC 
GGTGAATGCCAAGAATGGGGCThsa-mir-124a-1- AGGCCTCTCTCTCCGTGTTCACAGCGGACCTTGATTTA 111 precAATGTCCATACAATTAAGGCACGCGGTGAATGCCAAG AATGGGGCTG hsa-mir-124a-2-ATCAAGATTAGAGGCTCTGCTCTCCGTGTTCACAGCG 112 precGACCTTGATTTAATGTCATACAATTAAGGCACGCGGTGAATGCCAAGAGCGGAGCCTACGGCTGCACTTGAAG hsa-mir-124a-3-CCCGCCCCAGCCCTGAGGGCCCCTCTGCGTGTTCACA 113 precGCGGACCTTGATTTAATGTCTATACAATTAAGGCACGCGGTGAATGCCAAGAGAGGCGCCTCCGCCGCTCCTT hsa-mir-124a-3-TGAGGGCCCCTCTGCGTGTTCACAGCGGACCTTGATTT 114 precAATGTCTATACAATTAAGGCACGCGGTGAATGCCAAG AGAGGCGCCTCC hsa-mir-124a-precCTCTGCGTGTTCACAGCGGACCTTGATTTAATGTCTAT 115ACAATTAAGGCACGCGGTGAATGCCAAGAG hsa-mir-124b-CTCTCCGTGTTCACAGCGGACCTTGATTTAATGTCATA 116 precCAATTAAGGCACGCGGTGAATGCCAAGAG hsa-mir-125a-precTGCCAGTCTCTAGGTCCCTGAGACCCTTTAACCTGTGA 117GGACATCCAGGGTCACAGGTGAGGTTCTTGGGAGCCT GGCGTCTGGCC hsa-mir-125a-precGGTCCCTGAGACCCTTTAACCTGTGAGGACATCCAGG 118 GTCACAGGTGAGGTTCTTGGGAGCCTGGhsa-mir-125b-1 ACATTGTTGCGCTCCTCTCAGTCCCTGAGACCCTAACT 119TGTGATGTTTACCGTTTAAATCCACGGGTTAGGCTCTT GGGAGCTGCGAGTCGTGCTTTTGCATCCTGGAhsa-mir-125b-1 TGCGCTCCTCTCAGTCCCTGAGACCCTAACTTGTGATG 120TTTACCGTTTAAATCCACGGGTTAGGCTCTTGGGAGCT GCGAGTCGTGCT hsa-mir-125b-2-ACCAGACTTTTCCTAGTCCCTGAGACCCTAACTTGTGA 121 precGGTATTTTAGTAACATCACAAGTCAGGCTCTTGGGAC CTAGGCGGAGGGGA hsa-mir-125b-2-CCTAGTCCCTGAGACCCTAACTTGTGAGGTATTTTAGT 122 precAACATCACAAGTCAGGCTCTTGGGACCTAGGC hsa-mir-126-precCGCTGGCGACGGGACATTATTACTTTTGGTACGCGCTG 123TGACACTTCAAACTCGTACCGTGAGTAATAATGCGCC GTCCACGGCA hsa-mir-126-precACATTATTACTTTTGGTACGCGCTGTGACACTTCAAAC 124 TCGTACCGTGAGTAATAATGCGChsa-mir-127-prec TGTGATCACTGTCTCCAGCCTGCTGAAGCTCAGAGGG 125CTCTGATTCAGAAAGATCATCGGATCCGTCTGAGCTTG GCTGGTCGGAAGTCTCATCATChsa-mir-127-prec CCAGCCTGCTGAAGCTCAGAGGGCTCTGATTCAGAAA 126GATCATCGGATCCGTCTGAGCTTGGCTGGTCGG hsa-mir-128a-precTGAGCTGTTGGATTCGGGGCCGTAGCACTGTCTGAGA 127GGTTTACATTTCTCACAGTGAACCGGTCTCTTTTTCAG CTGCTTC hsa-mir-128b-GCCCGGCAGCCACTGTGCAGTGGGAAGGGGGGCCGAT 128 precACACTGTACGAGAGTGAGTAGCAGGTCTCACAGTGAACCGGTCTCTTTCCCTACTGTGTCACACTCCTAATGG 
hsa-mir-128-precGTTGGATTCGGGGCCGTAGCACTGTCTGAGAGGTTTA 129CATTTCTCACAGTGAACCGGTCTCTTTTTCAGC hsa-mir-129-precTGGATCTTTTTGCGGTCTGGGCTTGCTGTTCCTCTCAA 130CAGTAGTCAGGAAGCCCTTACCCCAAAAAGTATCTA hsa-mir-130a-precTGCTGCTGGCCAGAGCTCTTTTCACATTGTGCTACTGT 131CTGCACCTGTCACTAGCAGTGCAATGTTAAAAGGGCA TTGGCCGTGTAGTG hsa-mir-131-1-gccaggaggcggGGTTGGTTGTTATCTTTGGTTATCTAGCTG 132 precTATGAGTGGTGTGGAGTCTTCATAAAGCTAGATAACC GAAAGTAAAAATAACCCCATACACTGCGCAGhsa-mir-131-3- CACGGCGCGGCAGCGGCACTGGCTAAGGGAGGCCCGT 133 precTTCTCTCTTTGGTTATCTAGCTGTATGAGTGCCACAGAGCCGTCATAAAGCTAgataaccgaaagtagaaatg hsa-mir-131-precGTTGTTATCTTTGGTTATCTAGCTGTATGAGTGTATTG 134GTCTTCATAAAGCTAGATAACCGAAAGTAAAAAC hsa-mir-132-precCCGCCCCCGCGTCTCCAGGGCAACCGTGGCTTTCGATT 135GTTACTGTGGGAACTGGAGGTAACAGTCTACAGCCAT GGTCGCCCCGCAGCACGCCCACGCGChsa-mir-132-prec GGGCAACCGTGGCTTTCGATTGTTACTGTGGGAACTG 136GAGGTAACAGTCTACAGCCATGGTCGCCC hsa-mir-133a-1ACAATGCTTTGCTAGAGCTGGTAAAATGGAACCAAAT 137CGCCTCTTCAATGGATTTGGTCCCCTTCAACCAGCTGT AGCTATGCATTGA hsa-mir-133a-2GGGAGCCAAATGCTTTGCTAGAGCTGGTAAAATGGAA 138CCAAATCGACTGTCCAATGGATTTGGTCCCCTTCAACC AGCTGTAGCTGTGCATTGATGGCGCCGhsa-mir-133-prec GCTAGAGCTGGTAAAATGGAACCAAATCGCCTCTTCA 139ATGGATTTGGTCCCCTTCAACCAGCTGTAGC hsa-mir-134-precCAGGGTGTGTGACTGGTTGACCAGAGGGGCATGCACT 140GTGTTCACCCTGTGGGCCACCTAGTCACCAACCCTC hsa-mir-134-precAGGGTGTGTGACTGGTTGACCAGAGGGGCATGCACTG 141TGTTCACCCTGTGGGCCACCTAGTCACCAACCCT hsa-mir-135-1-AGGCCTCGCTGTTCTCTATGGCTTTTTATTCCTATGTG 142 precATTCTACTGCTCACTCATATAGGGATTGGAGCCGTGGC GCACGGCGGGGACA hsa-mir-135-2-AGATAAATTCACTCTAGTGCTTTATGGCTTTTTATTCC 143 precTATGTGATAGTAATAAAGTCTCATGTAGGGATGGAAG CCATGAAATACATTGTGAAAAATCAhsa-mir-135-prec CTATGGCTTTTTATTCCTATGTGATTCTACTGCTCACTC 144ATATAGGGATTGGAGCCGTGG hsa-mir-136-precTGAGCCCTCGGAGGACTCCATTTGTTTTGATGATGGAT 145TCTTATGCTCCATCATCGTCTCAAATGAGTCTTCAGAG GGTTCT hsa-mir-136-precGAGGACTCCATTTGTTTTGATGATGGATTCTTATGCTC 146 CATCATCGTCTCAAATGAGTCTTChsa-mir-137-prec CTTCGGTGACGGGTATTCTTGGGTGGATAATACGGATT 147ACGTTGTTATTGCTTAAGAATACGCGTAGTCGAGG 
hsa-mir-138-1-CCCTGGCATGGTGTGGTGGGGCAGCTGGTGTTGTGAA 148 precTCAGGCCGTTGCCAATCAGAGAACGGCTACTTCACAA CACCAGGGCCACACCACACTACAGGhsa-mir-138-2- CGTTGCTGCAGCTGGTGTTGTGAATCAGGCCGACGAG 149 precCAGCGCATCCTCTTACCCGGCTATTTCACGACACCAGG GTTGCATCA hsa-mir-138-precCAGCTGGTGTTGTGAATCAGGCCGACGAGCAGCGCAT 150CCTCTTACCCGGCTATTTCACGACACCAGGGTTG hsa-mir-139-precGTGTATTCTACAGTGCACGTGTCTCCAGTGTGGCTCGG 151AGGCTGGAGACGCGGCCCTGTTGGAGTAAC hsa-mir-140TGTGTCTCTCTCTGTGTCCTGCCAGTGGTTTTACCCTAT 152GGTAGGTTACGTCATGCTGTTCTACCACAGGGTAGAA CCACGGACAGGATACCGGGGCACChsa-mir-140as- TCCTGCCAGTGGTTTTACCCTATGGTAGGTTACGTCAT 153 precGCTGTTCTACCACAGGGTAGAACCACGGACAGGA hsa-mir-140s-precCCTGCCAGTGGTTTTACCCTATGGTAGGTTACGTCATG 154CTGTTCTACCACAGGGTAGAACCACGGACAGG hsa-mir-141-precCGGCCGGCCCTGGGTCCATCTTCCAGTACAGTGTTGG 155ATGGTCTAATTGTGAAGCTCCTAACACTGTCTGGTAAA GATGGCTCCCGGGTGGGTTChsa-mir-141-prec GGGTCCATCTTCCAGTACAGTGTTGGATGGTCTAATTG 156TGAAGCTCCTAACACTGTCTGGTAAAGATGGCCC hsa-mir-142as-ACCCATAAAGTAGAAAGCACTACTAACAGCACTGGAG 157 precGGTGTAGTGTTTCCTACTTTATGGATG hsa-mir-142-precGACAGTGCAGTCACCCATAAAGTAGAAAGCACTACTA 158ACAGCACTGGAGGGTGTAGTGTTTCCTACTTTATGGAT GAGTGTACTGTG hsa-mir-142s-presACCCATAAAGTAGAAAGCACTACTAACAGCACTGGAG 159 GGTGTAGTGTTTCCTACTTTATGGATGhsa-mir-143-prec GCGCAGCGCCCTGTCTCCCAGCCTGAGGTGCAGTGCT 160GCATCTCTGGTCAGTTGGGAGTCTGAGATGAAGCACT GTAGCTCAGGAAGAGAGAAGTTGTTCTGCAGChsa-mir-143-prec CCTGAGGTGCAGTGCTGCATCTCTGGTCAGTTGGGAG 161TCTGAGATGAAGCACTGTAGCTCAGG hsa-mir-144-precTGGGGCCCTGGCTGGGATATCATCATATACTGTAAGTT 162TGCGATGAGACACTACAGTATAGATGATGTACTAGTC CGGGCACCCCC hsa-mir-144-precGGCTGGGATATCATCATATACTGTAAGTTTGCGATGA 163 GACACTACAGTATAGATGATGTACTAGTChsa-mir-145-prec CACCTTGTCCTCACGGTCCAGTTTTCCCAGGAATCCCT 164TAGATGCTAAGATGGGGATTCCTGGAAATACTGTTCTT GAGGTCATGGTT hsa-mir-145-precCTCACGGTCCAGTTTTCCCAGGAATCCCTTAGATGCTA 165AGATGGGGATTCCTGGAAATACTGTTCTTGAG hsa-mir-146-precCCGATGTGTATCCTCAGCTTTGAGAACTGAATTCCATG 166GGTTGTGTCAGTGTCAGACCTCTGAAATTCAGTTCTTC AGCTGGGATATCTCTGTCATCGThsa-mir-146-prec AGCTTTGAGAACTGAATTCCATGGGTTGTGTCAGTGTC 
167AGACCTGTGAAATTCAGTTCTTCAGCT hsa-mir-147-precAATCTAAAGACAACATTTCTGCACACACACCAGACTA 168TGGAAGCCAGTGTGTGGAAATGCTTCTGCTAGATT hsa-mir-148-precGAGGCAAAGTTCTGAGACACTCCGACTCTGAGTATGA 169TAGAAGTCAGTGCACTACAGAACTTTGTCTC hsa-mir-149-precGCCGGCGCCCGAGCTCTGGCTCCGTGTCTTCACTCCCG 170TGCTTGTCCGAGGAGGGAGGGAGGGACGGGGGCTGT GCTGGGGCAGCTGGA hsa-mir-149-precGCTCTGGCTCCGTGTCTTCACTCCCGTGCTTGTCCGAG 171 GAGGGAGGGAGGGAChsa-mir-150-prec CTCCCCATGGCCCTGTCTCCCAACCCTTGTACCAGTGC 172TGGGCTCAGACCCTGGTACAGGCCTGGGGGACAGGGA CCTGGGGAC hsa-mir-150-precCCCTGTCTCCCAACCCTTGTACCAGTGCTGGGCTCAGA 173 CCCTGGTACAGGCCTGGGGGACAGGGhsa-mir-151-prec CCTGCCCTCGAGGAGCTCACAGTCTAGTATGTCTCATC 174CCCTACTAGACTGAAGCTCCTTGAGGACAGG hsa-mir-152-precTGTCCCCCCCGGCCCAGGTTCTGTGATACACTCCGACT 175CGGGCTCTGGAGCAGTCAGTGCATGACAGAACTTGGG CCCGGAAGGACC hsa-mir-152-precGGCCCAGGTTCTGTGATACACTCCGACTCGGGCTCTG 176GAGCAGTCAGTGCATGACAGAACTTGGGCCCCGG hsa-mir-153-1-CTCACAGCTGCCAGTGTCATTTTTGTGATCTGCAGCTA 177 precGTATTCTCACTCCAGTTGCATAGTCACAAAAGTGATCA TTGGCAGGTGTGGC hsa-mir-153-1-tctctctctccctcACAGCTGCCAGTGTCATTGTCACAAAAGTG 178 precATCATTGGCAGGTGTGGCTGCTGCATG hsa-mir-153-2-AGCGGTGGCCAGTGTCATTTTTGTGATGTTGCAGCTAG 179 precTAATATGAGCCCAGTTGCATAGTCACAAAAGTGATCA TTGGAAACTGTG hsa-mir-153-2-CAGTGTCATTTTTGTGATGTTGCAGCTAGTAATATGAG 180 precCCCAGTTGCATAGTCACAAAAGTGATCATTG hsa-mir-154-precGTGGTACTTGAAGATAGGTTATCCGTGTTGCCTTCGCT 181TTATTTGTGACGAATCATACACGGTTGACCTATTTTTC AGTACCAA hsa-mir-154-precGAAGATAGGTTATCCGTGTTGCCTTCGCTTTATTTGTG 182 ACGAATCATACACGGTTGACCTATTTTThsa-mir-155-prec CTGTTAATGCTAATCGTGATAGGGGTTTTTGCCTCCAA 183CTGACTCCTACATATTAGCATTAACAG hsa-mir-16-2-precCAATGTCAGCAGTGCCTTAGCAGCACGTAAATATTGG 184CGTTAAGATTCTAAAATTATCTCCAGTATTAACTGTGC TGCTGAAGTAAGGTTGACCATACTCTACAGTTGhsa-mir-181a-prec AGAAGGGCTATCAGGCCAGCCTTCAGAGGACTCCAAG 185GAACATTCAACGCTGTCGGTGAGTTTGGGATTTGAAAAAACCACTGACCGTTGACTGTACCTTGGGGTCCTTA hsa-mir-181b-TGAGTTTTGAGGTTGCTTCAGTGAACATTCAACGCTGT 186 precCGGTGAGTTTGGAATTAAAATCAAAACCATCGACCGTTGATTGTACCCTATGGCTAACCATCATCTACTCCA 
hsa-mir-181c-precCGGAAAATTTGCCAAGGGTTTGGGGGAACATTCAACC 187TGTCGGTGAGTTTGGGCAGCTCAGGCAAACCATCGACCGTTGAGTGGACCCTGAGGCCTGGAATTGCCATCCT hsa-mir-182-as-GAGCTGCTTGCCTCCCCCCGTTTTTGGCAATGGTAGAA 188 precCTCACACTGGTGAGGTAACAGGATCCGGTGGTTCTAGACTTGCCAACTATGGGGCGAGGACTCAGCCGGCAC hsa-mir-182-precTTTTTGGCAATGGTAGAACTCACACTGGTGAGGTAAC 189AGGATCCGGTGGTTCTAGACTTGCCAACTATGG hsa-mir-183-precCCGCAGAGTGTGACTCCTGTTCTGTGTATGGCACTGGT 190AGAATTCACTGTGAACAGTCTCAGTCAGTGAATTACCGAAGGGCCATAAACAGAGCAGAGACAGATCCACGA hsa-mir-184-precCCAGTCACGTCCCCTTATCACTTTTCCAGCCCAGCTTT 191GTGACTGTAAGTGTTGGACGGAGAACTGATAAGGGTA GGTGATTGA hsa-mir-184-precCCTTATCACTTTTCCAGCCCAGCTTTGTGACTGTAAGT 192 GTTGGACGGAGAACTGATAAGGGTAGGhsa-mir-185-prec AGGGGGCGAGGGATTGGAGAGAAAGGCAGTTCCTGA 193TGGTCCCCTCCCCAGGGGCTGGCTTTCCTCTGGTCCTT CCCTCCCA hsa-mir-185-precAGGGATTGGAGAGAAAGGCAGTTCCTGATGGTCCCCT 194 CCCCAGGGGCTGGCTTTCCTCTGGTCCTThsa-mir-186-prec TGCTTGTAACTTTCCAAAGAATTCTCCTTTTGGGCTTT 195CTGGTTTTATTTTAAGCCCAAAGGTGAATTTTTTGGGA AGTTTGAGCT hsa-mir-186-precACTTTCCAAAGAATTCTCCTTTTGGGCTTTCTGGTTTTA 196TTTTAAGCCCAAAGGTGAATTTTTTGGGAAGT hsa-mir-187-precGGTCGGGCTCACCATGACACAGTGTGAGACTCGGGCT 197ACAACACAGGACCCGGGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCA hsa-mir-188-precTGCTCCCTCTCTCACATCCCTTGCATGGTGGAGGGTGA 198GCTTTCTGAAAACCCCTCCCACATGCAGGGTTTGCAG GATGGCGAGCC hsa-mir-188-precTCTCACATCCCTTGCATGGTGGAGGGTGAGCTTTCTGA 199AAACCCCTCCCACATGCAGGGTTTGCAGGA hsa-mir-189-precCTGTCGATTGGACCCGCCCTCCGGTGCCTACTGAGCTG 200ATATCAGTTCTCATTTTACACACTGGCTCAGTTCAGCA GGAACAGGAGTCGAGCCCTTGAGCAAhsa-mir-189-prec CTCCGGTGCCTACTGAGCTGATATCAGTTCTCATTTTA 201CACACTGGCTCAGTTCAGCAGGAACAGGAG hsa-mir-190-precTGCAGGCCTCTGTGTGATATGTTTGATATATTAGGTTG 202TTATTTAATCCAACTATATATCAAACATATTCCTACAG TGTCTTGCC hsa-mir-190-precCTGTGTGATATGTTTGATATATTAGGTTGTTATTTAAT 203 CCAACTATATATCAAACATATTCCTACAGhsa-mir-191-prec CGGCTGGACAGCGGGCAACGGAATCCCAAAAGCAGC 204TGTTGTCTCCAGAGCATTCCAGCTGCGCTTGGATTTCG TCCCCTGCTCTCCTGCCThsa-mir-191-prec AGCGGGCAACGGAATCCCAAAAGCAGCTGTTGTCTCC 
205AGAGCATTCCAGCTGCGCTTGGATTTCGTCCCCTGCT hsa-mir-192-2/3CCGAGACCGAGTGCACAGGGCTCTGACCTATGAATTG 206ACAGCCAGTGCTCTCGTCTCCCCTCTGGCTGCCAATTC CATAGGTCACAGGTATGTTCGCCTCAATGCCAGhsa-mir-192-prec GCCGAGACCGAGTGCACAGGGCTCTGACCTATGAATT 207GACAGCCAGTGCTCTCGTCTCCCCTCTGGCTGCCAATTCCATAGGTCACAGGTATGTTCGCCTCAATGCCAGC hsa-mir-193-precCGAGGATGGGAGCTGAGGGCTGGGTCTTTGCGGGCGA 208GATGAGGGTGTCGGATCAACTGGCCTACAAAGTCCCA GTTCTCGGCCCCCG hsa-mir-193-precGCTGGGTCTTTGCGGGCGAGATGAGGGTGTCGGATCA 209 ACTGGCCTACAAAGTCCCAGThsa-mir-194-prec ATGGTGTTATCAAGTGTAACAGCAACTCCATGTGGAC 210TGTGTACCAATTTCCAGTGGAGATGCTGTTACTTTTGA TGGTTACCAA hsa-mir-194-precGTGTAACAGCAACTCCATGTGGACTGTGTACCAATTTC 211 CAGTGGAGATGCTGTTACTTTTGAThsa-mir-195-prec AGCTTCCCTGGCTCTAGCAGCACAGAAATATTGGCAC 212AGGGAAGCGAGTCTGCCAATATTGGCTGTGCTGCTCC AGGCAGGGTGGTG hsa-mir-195-precTAGCAGCACAGAAATATTGGCACAGGGAAGCGAGTCT 213 GCCAATATTGGCTGTGCTGCThsa-mir-196-1- CTAGAGCTTGAATTGGAACTGCTGAGTGAATTAGGTA 214 precGTTTCATGTTGTTGGGCCTGGGTTTCTGAACACAACAACATTAAACCACCCGATTCACGGCAGTTACTGCTCC hsa-mir-196-1-GTGAATTAGGTAGTTTCATGTTGTTGGGCCTGGGTTTC 215 precTGAACACAACAACATTAAACCACCCGATTCAC hsa-mir-196-2-TGCTCGCTCAGCTGATCTGTGGCTTAGGTAGTTTCATG 216 precTTGTTGGGATTGAGTTTTGAACTCGGCAACAAGAAACTGCCTGAGTTACATCAGTCGGTTTTCGTCGAGGGC hsa-mir-196-precGTGAATTAGGTAGTTTCATGTTGTTGGGCCTGGGTTTC 217TGAACACAACAACATTAAACCACCCGATTCAC hsa-mir-197-precGGCTGTGCCGGGTAGAGAGGGCAGTGGGAGGTAAGA 218GCTCTTCACCCTTCACCACCTTCTCCACCCAGCATGGCC hsa-mir-198-precTCATTGGTCCAGAGGGGAGATAGGTTCCTGTGATTTTT 219 CCTTCTTCTCTATAGAATAAATGAhsa-mir-199a-1- GCCAACCCAGTGTTCAGACTACCTGTTCAGGAGGCTC 220 precTCAATGTGTACAGTAGTCTGCACATTGGTTAGGC hsa-mir-199a-2-AGGAAGCTTCTGGAGATCCTGCTCCGTCGCCCCAGTG 221 precTTCAGACTACCTGTTCAGGACAATGCCGTTGTACAGTAGTCTGCACATTGGTTAGACTGGGCAAGGGAGAGCA hsa-mir-199b-CCAGAGGACACCTCCACTCCGTCTACCCAGTGTTTAG 222 precACTATCTGTTCAGGACTCCCAAATTGTACAGTAGTCTGCACATTGGTTAGGCTGGGCTGGGTTAGACCCTCGG hsa-mir-199s-precGCCAACCCAGTGTTCAGACTACCTGTTCAGGAGGCTC 223TCAATGTGTACAGTAGTCTGCACATTGGTTAGGC 
hsa-mir-200a-precGCCGTGGCCATCTTACTGGGCAGCATTGGATGGAGTC 224AGGTCTCTAATACTGCCTGGTAATGATGACGGC hsa-mir-200b-CCAGCTCGGGCAGCCGTGGCCATCTTACTGGGCAGCA 225 precTTGGATGGAGTCAGGTCTCTAATACTGCCTGGTAATG ATGACGGCGGAGCCCTGCACGhsa-mir-202-prec GTTCCTTTTTCCTATGCATATACTTCTTTGAGGATCTGG 226CCTAAAGAGGTATAGGGCATGGGAAGATGGAGC hsa-mir-203-precGTGTTGGGGACTCGCGCGCTGGGTCCAGTGGTTCTTA 227ACAGTTCAACAGTTCTGTAGCGCAATTGTGAAATGTTTAGGACCACTAGACCCGGCGGGCGCGGCGACAGCGA hsa-mir-204-precGGCTACAGTCTTTCTTCATGTGACTCGTGGACTTCCCT 228TTGTCATCCTATGCCTGAGAATATATGAAGGAGGCTGGGAAGGCAAAGGGACGTTCAATTGTCATCACTGGC hsa-mir-205-precAAAGATCCTCAGACAATCCATGTGCTTCTCTTGTCCTT 229CATTCCACCGGAGTCTGTCTCATACCCAACCAGATTTCAGTGGAGTGAAGTTCAGGAGGCATGGAGCTGACA hsa-mir-206-precTGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATAT 230GGATTACTTTGCTATGGAATGTAAGGAAGTGTGTGGT TTCGGCAAGTG hsa-mir-206-precAGGCCACATGCTTCTTTATATCCCCATATGGATTACTT 231TGCTATGGAATGTAAGGAAGTGTGTGGTTTT hsa-mir-208-precTGACGGGCGAGCTTTTGGCCCGGGTTATACCTGATGCT 232CACGTATAAGACGAGCAAAAAGCTTGTTGGTCA hsa-mir-210-precACCCGGCAGTGCCTCCAGGCGCAGGGCAGCCCCTGCC 233CACCGCACACTGCGCTGCCCCAGACCCACTGTGCGTGTGACAGCGGCTGATCTGTGCCTGGGCAGCGCGACCC hsa-mir-211-precTCACCTGGCCATGTGACTTGTGGGCTTCCCTTTGTCAT 234CCTTCGCCTAGGGCTCTGAGCAGGGCAGGGACAGCAAAGGGGTGCTCAGTTGTCACTTCCCACAGCACGGAG hsa-mir-212-precCGGGGCACCCCGCCCGGACAGCGCGCCGGCACCTTGG 235CTCTAGACTGCTTACTGCCCGGGCCGCCCTCAGTAACAGTCTCCAGTCACGGCCACCGACGCCTGGCCCCGCC hsa-mir-213-precCCTGTGCAGAGATTATTTTTTAAAAGGTCACAATCAAC 236ATTCATTGCTGTCGGTGGGTTGAACTGTGTGGACAAGCTCACTGAACAATGAATGCAACTGTGGCCCCGCTT hsa-mir-213-prec-GAGTTTTGAGGTTGCTTCAGTGAACATTCAACGCTGTC 237 LIMGGTGAGTTTGGAATTAAAATCAAAACCATCGACCGTT GATTGTACCCTATGGCTAACCATCATCTACTCChsa-mir-214-prec GGCCTGGCTGGACAGAGTTGTCATGTGTCTGCCTGTCT 238ACACTTGCTGTGCAGAACATCCGCTCACCTGTACAGCAGGCACAGACAGGCAGTCACATGACAACCCAGCCT hsa-mir-215-precATCATTCAGAAATGGTATACAGGAAAATGACCTATGA 239ATTGACAGACAATATAGCTGAGTTTGTCTGTCATTTCTTTAGGCCAATATTCTGTATGACTGTGCTACTTCAA hsa-mir-216-precGATGGCTGTGAGTTGGCTTAATCTCAGCTGGCAACTGT 
240GAGATGTTCATACAATCCCTCACAGTGGTCTCTGGGATTATGCTAAACAGAGCAATTTCCTAGCCCTCACGA hsa-mir-217-precAGTATAATTATTACATAGTTTTTGATGTCGCAGATACT 241GCATCAGGAACTGATTGGATAAGAATCAGTCACCATCAGTTCCTAATGCATTGCCTTCAGCATCTAAACAAG hsa-mir-218-1-GTGATAATGTAGCGAGATTTTCTGTTGTGCTTGATCTA 242 precACCATGTGGTTGCGAGGTATGAGTAAAACATGGTTCCGTCAAGCACCATGGAACGTCACGCAGCTTTCTACA hsa-mir-218-2-GACCAGTCGCTGCGGGGCTTTCCTTTGTGCTTGATCTA 243 precACCATGTGGTGGAACGATGGAAACGGAACATGGTTCTGTCAAGCACCGCGGAAAGCACCGTGCTCTCCTGCA hsa-mir-219-precCCGCCCCGGGCCGCGGCTCCTGATTGTCCAAACGCAA 244TTCTCGAGTCTATGGCTCCGGCCGAGAGTTGAGTCTGGACGTCCCGAGCCGCCGCCCCCAAACCTCGAGCGGG hsa-mir-220-precGACAGTGTGGCATTGTAGGGCTCCACACCGTATCTGA 245CACTTTGGGCGAGGGCACCATGCTGAAGGTGTTCATGATGCGGTCTGGGAACTCCTCACGGATCTTACTGATG hsa-mir-221-precTGAACATCCAGGTCTGGGGCATGAACCTGGCATACAA 246TGTAGATTTCTGTGTTCGTTAGGCAACAGCTACATTGTCTGCTGGGTTTCAGGCTACCTGGAAACATGTTCTC hsa-mir-222-precGCTGCTGGAAGGTGTAGGTACCCTCAATGGCTCAGTA 247GCCAGTGTAGATCCTGTCTTTCGTAATCAGCAGCTACATCTGGCTACTGGGTCTCTGATGGCATCTTCTAGCT hsa-mir-223-precCCTGGCCTCCTGCAGTGCCACGCTCCGTGTATTTGACA 248AGCTGAGTTGGACACTCCATGTGGTAGAGTGTCAGTTTGTCAAATACCCCAAGTGCGGCACATGCTTACCAG hsa-mir-224-precGGGCTTTCAAGTCACTAGTGGTTCCGTTTAGTAGATGA 249TTGTGCATTGTTTCAAAATGGTGCCCTAGTGACTACAA AGCCC hsA-mir-29b-CTTCTGGAAGCTGGTTTCACATGGTGGCTTAGATTTTT 250 1 = 102-prec1CCATCTTTGTATCTAGCACCATTTGAAATCAGTGTTTT AGGAG hsA-mir-29b-CTTCAGGAAGCTGGTTTCATATGGTGGTTTAGATTTAA 251 2 = 102prec7.1 = 7.2ATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTT GGGGG hsA-mir-29b-CTTCAGGAAGCTGGTTTCATATGGTGGTTTAGATTTAA 252 3 = 102prec7.1 = 7.2ATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTT GGGGG hsa-mir-30* = mir-GTGAGCGACTGTAAACATCCTCGACTGGAAGCTGTGA 253 097-prec-6AGCCACAGATGGGCTTTCAGTCGGATGTTTGCAGCTG CCTACT mir-033bACCAAGTTTCAGTTCATGTAAACATCCTACACTCAGCT 254GTAATACATGGATTGGCTGGGAGGTGGATGTTTACTT CAGCTGACTTGGA mir-101-TGCCCTGGCTCAGTTATCACAGTGCTGATGCTGTCTAT 255 precursor-9 = mir-TCTAAAGGTACAGTACTGTGATAACTGAAGGATGGCA 101-3 mir-108-1-smallACACTGCAAGAACAATAAGGATTTTTAGGGGCATTAT 
256GACTGAGTCAGAAAACACAGCTGCCCCTGAAAGTCCC TCATTTTTCTTGCTGT mir-108-2-smallACTGCAAGAGCAATAAGGATTTTTAGGGGCATTATGA 257TAGTGGAATGGAAACACATCTGCCCCCAAAAGTCCCT CATTTT mir-123-prec =CGCTGGCGACGGGACATTATTACTTTTGGTACGCGCTG 258 mir-126-prec TGACACTTCAAACTCGTACCGTGAGTAATAATGCGCCGTCCACGGCA mir-123-prec =ACATTATTACTTTTGGTACGCGCTGTGACACTTCAAAC 259 mir-126-precTCGTACCGTGAGTAATAATGCGC mir-129-1-precTGGATCTTTTTGCGGTCTGGGCTTGCTGTTCCTCTCAA 260CAGTAGTCAGGAAGCCCTTACCCCAAAAAGTATCTA mir-129-small-TGCCCTTCGCGAATCTTTTTGCGGTCTGGGCTTGCTGT 261 2 = 129b?ACATAACTCAATAGCCGGAAGCCCTTACCCCAAAAAG CATTTGCGGAGGGCG mir-133b-smallGCCCCCTGCTCTGGCTGGTCAAACGGAACCAAGTCCG 262TCTTCCTGAGAGGTTTGGTCCCCTTCAACCAGCTACAG CAGGG mir-135-small-AGATAAATTCACTCTAGTGCTTTATGGCTTTTTATTCC 263TATGTGATAGTAATAAAGTCTCATGTAGGGATGGAAG CCATGAAATACATTGTGAAAAATCAmir-148b-small AAGCACGATTAGCATTTGAGGTGAAGTTCTGTTATAC 264ACTCAGGCTGTGGCTCTCTGAAAGTCAGTGCAT mir-151-precCCTGTCCTCAAGGAGCTTCAGTCTAGTAGGGGATGAG 265ACATACTAGACTGTGAGCTCCTCGAGGGCAGG mir-155-CTGTTAATGCTAATCGTGATAGGGGTTTTTGCCTCCAA 266 prec(BIC)CTGACTCCTACATATTAGCATTAACAG mir-156 = mir-CCTAACACTGTCTGGTAAAGATGGCTCCCGGGTGGGT 267 157 = overlap mir-TCTCTCGGCAGTAACCTTCAGGGAGCCCTGAAGACCA 141 TGGAGGAC mir-158-small =GCCGAGACCGAGTGCACAGGGCTCTGACCTATGAATT 268 mir-192GACAGCCAGTGCTCTCGTCTCCCCTCTGGCTGCCAATTCCATAGGTCACAGGTATGTTCGCCTCAATGCCAGC mir-159-1-smallTCCCGCCCCCTGTAACAGCAACTCCATGTGGAAGTGC 269CCACTGGTTCCAGTGGGGCTGCTGTTATCTGGGGCGA GGGCCA mir-161-smallAAAGCTGGGTTGAGAGGGCGAAAAAGGATGAGGTGA 270CTGGTCTGGGCTACGCTATGCTGCGGCGCTCGGG mir-163-1b-smallCATTGGCCTCCTAAGCCAGGGATTGTGGGTTCGAGTC 271 CCACCCGGGGTAAAGAAAGGCCGAATTmir-163-3-small CCTAAGCCAGGGATTGTGGGTTCGAGTCCCACCTGGG 272GTAGAGGTGAAAGTTCCTTTTACGGAATTTTTT mir-175-GGGCTTTCAAGTCACTAGTGGTTCCGTTTAGTAGATGA 273 small = mir-224TTGTGCATTGTTTCAAAATGGTGCCCTAGTGACTACAA AGCCC mir-177-smallACGCAAGTGTCCTAAGGTGAGCTCAGGGAGCACAGAA 274ACCTCCAGTGGAACAGAAGGGCAAAAGCTCATT mir-180-smallCATGTGTCACTTTCAGGTGGAGTTTCAAGAGTCCCTTC 275CTGGTTCACCGTCTCCTTTGCTCTTCCACAAC 
mir-187-precGGTCGGGCTCACCATGACACAGTGTGAGACTCGGGCT 276ACAACACAGGACCCGGGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCA mir-188-precTGCTCCCTCTCTCACATCCCTTGCATGGTGGAGGGTGA 277GCTTTCTGAAAACCCCTCCCACATGCAGGGTTTGCAG GATGGCGAGCC mir-190-precTGCAGGCCTCTGTGTGATATGTTTGATATATTAGGTTG 278TTATTTAATCCAACTATATATCAAACATATTCCTACAG TGTCTTGCC mir-197-2GTGCATGTGTATGTATGTGTGCATGTGCATGTGTATGT 279 GTATGAGTGCATGCGTGTGTGCmir-197-prec GGCTGTGCCGGGTAGAGAGGGCAGTGGGAGGTAAGA 280GCTCTTCACCCTTCACCACCTTCTCCACCCAGCATGGCC mir-202-precGTTCCTTTTTCCTATGCATATACTTCTTTGAGGATCTGG 281CCTAAAGAGGTATAGGGCATGGGAAGATGGAGC mir-294-1 (chr16)CAATCTTCCTTTATCATGGTATTGATTTTTCAGTGCTTC 282 CCTTTTGTGTGAGAGAAGATAmir-hes1 ATGGAGCTGCTCACCCTGTGGGCCTCAAATGTGGAGG 283AACTATTCTGATGTCCAAGTGGAAAGTGCTGCGACAT TTGAGCGTCACCGGTGACGCCCATATCAmir-hes2 GCATCCCCTCAGCCTGTGGCACTCAAACTGTGGGGGC 284ACTTTCTGCTCTCTGGTGAAAGTGCCGCCATCTTTTGA GTGTTACCGCTTGAGAAGACTCAACCmir-hes3 CGAGGAGCTCATACTGGGATACTCAAAATGGGGGCGC 285TTTCCTTTTTGTCTGTTACTGGGAAGTGCTTCGATTTTG GGGTGTCCCTGTTTGAGTAGGGCATChsa-mir-29b-1 CTTCAGGAAGCTGGTTTCATATGGTGGTTTAGATTTAA 664ATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTT GGGGG *An underlined sequencewithin a precursor sequence represents a processed miR transcript. Allsequences are human.

Genome Analysis

The BUILD 33 and BUILD 34 Version 1 releases of the Homo sapiens genome, available at the NCBI website (see above), were used for genome analysis. For each human miR present in the miR database, a BLAST search was performed using the default parameters against the human genome to find the precise location, followed by mapping using the maps available at the Human Genome Resources at the NCBI website. See also Altschul et al. (1990), J. Mol. Biol. 215:403-10 and Altschul et al. (1997), Nucleic Acids Res. 25:3389-3402, the entire disclosures of which are herein incorporated by reference, for a discussion of the BLAST search algorithm. Also, as a confirmation of the data, the human clone corresponding to each miR was identified and mapped to the human genome (see Table 2). Perl scripts for the automatic submission of BLAST jobs and for the retrieval of the search results were based on the LWP, HTML, and HTTP Perl modules and BioPerl modules.
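The retrieval step described above can be illustrated with a minimal sketch. It is written in Python rather than Perl and is not the scripts referred to in the text. It assumes that hits are available in the standard 12-column tab-separated BLAST layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The function name `best_hits` and the sample hit lines are hypothetical.

```python
from typing import Dict, NamedTuple


class Hit(NamedTuple):
    query: str       # miR precursor name
    subject: str     # chromosome or contig identifier
    start_mb: float  # lower genomic coordinate, in Mb
    end_mb: float    # upper genomic coordinate, in Mb
    bitscore: float


def best_hits(tabular: str) -> Dict[str, Hit]:
    """Keep the highest-scoring hit per query from 12-column tabular BLAST text."""
    best: Dict[str, Hit] = {}
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        sstart, send = int(f[8]), int(f[9])
        # Minus-strand hits list the larger coordinate first; normalise the order.
        lo, hi = min(sstart, send), max(sstart, send)
        hit = Hit(f[0], f[1], lo / 1e6, hi / 1e6, float(f[11]))
        if f[0] not in best or hit.bitscore > best[f[0]].bitscore:
            best[f[0]] = hit
    return best


# Hypothetical hit lines, invented for illustration only.
rows = [
    ["mir-example-1", "chrA", "100.0", "80", "0", "0", "1", "80",
     "90200001", "90200080", "1e-40", "160"],
    ["mir-example-1", "chrB", "95.0", "40", "2", "0", "1", "40",
     "5000040", "5000001", "1e-10", "60"],
]
example = "\n".join("\t".join(r) for r in rows)
hits = best_hits(example)
print(hits["mir-example-1"])
```

Selecting the single best-scoring hit is one simple way to assign a location. A precursor that matches several loci equally well would need manual review, as the confirmation against mapped human clones in the text suggests.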

Fragile Site Database

This database was constructed using the Virtual Gene Nomenclature Workshop, maintained by the HUGO Gene Nomenclature Committee at University College, London. For each FRA locus, the literature was screened for publications reporting the cloning of the locus. In ten cases, genomic positions for both centromeric and telomeric ends were found. The total genomic length of these FRA loci is 26.9 Mb. In twenty-nine cases, only one anchoring marker was identified. It was determined, based on the published data, that 3 Mb can be used as the median length for each FRA locus. Therefore, 3 Mb was used as a guideline or window length for considering whether miR genes were in close proximity to the FRA sites.
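The window rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: a miR gene is reduced to a single position in Mb, a FRA locus is either a (start, end) pair or a single anchoring marker, and the labels "inside", "near" and "distant" are hypothetical names. The FRA7H coordinates (129.8-130.4 Mb) and the approximate miR positions are taken from Tables 2 and 5.

```python
from typing import Optional, Tuple

WINDOW_MB = 3.0  # guideline window length stated in the text


def classify(mir_mb: float,
             fra: Tuple[float, Optional[float]],
             window_mb: float = WINDOW_MB) -> str:
    """Label a miR position relative to a FRA locus given in Mb."""
    start, end = fra
    if end is None:
        end = start  # only one anchoring marker is known
    if start <= mir_mb <= end:
        return "inside"
    # Distance to the nearer end of the locus.
    distance = min(abs(mir_mb - start), abs(mir_mb - end))
    return "near" if distance <= window_mb else "distant"


fra7h = (129.8, 130.4)
print(classify(130.0, fra7h))  # midpoint of the miR-29a/miR-29b interval
print(classify(127.3, fra7h))  # approximate position of miR-129-1
print(classify(120.0, fra7h))  # an arbitrary position far from the locus
```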

The human clones for seventeen HPV16 integration sites (IS) were also precisely mapped on the human genome. By analogy with the length of a FRA, in the case of HPV16 integration sites, “close” vicinity was defined to be a distance of less than 2 Mb.

PubMed Database

The PubMed database was screened on-line for publications describing cancer-related abnormalities such as minimal regions of loss-of-heterozygosity (minimal LOH) and minimal regions of amplification (minimal amplicons) using the words “LOH and genome-wide,” “amplification and genome-wide” and “amplicon and cancer.” The PubMed database is maintained by the NCBI and was accessed via its website. The data obtained from thirty-two papers were used to screen for putative CAGRs, based on markers with high frequency of LOH/amplification. As a second step, a literature search was performed to determine the presence or absence of the above three types of alterations and to determine the precise location of miRs with respect to CAGRs (see above). Search phrases included the combinations “minimal regions of LOH AND cancer” and “minimal region amplification AND cancer.” A total of 296 publications were found and manually curated to find regions defined by both telomeric and centromeric markers. One hundred fifty-four minimally deleted regions (median length 4.14 Mb) and 37 minimally amplified regions (median length 2.45 Mb) were identified with precise genomic mapping for both telomeric and centromeric ends involving all human chromosomes except Y. To identify common breakpoint regions, PubMed was searched with the combination “translocation AND cloning AND breakpoint AND cancer.” The search yielded 308 papers, which were then manually curated. Among these papers, 45 translocations with at least one breakpoint precisely mapped were reported.

Statistical Analyses

The incidence of miR genes and their association with specific chromosomes and chromosome regions, such as FRAs and amplified or deleted regions in cancer, was analyzed with random effect Poisson regression models. Under these models, “events” are defined as the number of miR genes, and non-overlapping lengths of the region of interest define the exposure “time” (i.e., fragile site versus non-fragile site, etc.). The “length” of a region was exact, if known, or estimated as ±1 Mb if unknown. The random effect used was chromosomal location, in that data within a chromosome were assumed to be correlated. The fixed effect in each model consisted of an indicator variable(s) for the type of region. This model provided the incidence rate ratio (IRR), 2-sided 95% confidence interval of the IRR, and 2-sided p-values for testing the hypothesis that the IRR is 1.0. An IRR significantly greater than 1 indicates an increase in the number of miR genes within a region.
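The quantities reported by such a model can be illustrated with a deliberately simplified sketch. It is not the random effect model described above: it omits the chromosome-level random effect and compares only two groups, using the standard large-sample (Wald) approximation on the log scale. The function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def incidence_rate_ratio(events_a: int, exposure_a: float,
                         events_b: int, exposure_b: float,
                         z: float = 1.96) -> Tuple[float, float, float, float]:
    """Crude IRR of group A versus group B with a Wald 95% CI and 2-sided p.

    Events are counts of miR genes; exposure is the region length in Mb.
    """
    irr = (events_a / exposure_a) / (events_b / exposure_b)
    # Standard error of log(IRR) for two independent Poisson counts.
    se = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(irr) - z * se)
    upper = math.exp(math.log(irr) + z * se)
    # Two-sided p-value for H0: IRR = 1, from the normal approximation.
    p = math.erfc(abs(math.log(irr) / se) / math.sqrt(2.0))
    return irr, lower, upper, p


# Hypothetical counts: 10 genes in 50 Mb versus 20 genes in 400 Mb.
irr, lower, upper, p = incidence_rate_ratio(10, 50.0, 20, 400.0)
print(f"IRR={irr:.2f}, 95% CI=({lower:.2f}, {upper:.2f}), p={p:.4f}")
```

Because within-chromosome correlation is ignored here, the interval from this sketch would generally differ from the one a random effect model provides.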

Each model was repeated considering the distribution of miR genes only in the transcriptionally active portion of the genome (about 43% of the genome using the published data), rather than the entire chromosome length, and similar results were obtained. Considering the distribution of miRs only in the transcriptionally active portion of the genome is more conservative, and takes into account the phenomenon of clustering that was observed for the miR genes' genomic location. All computations were completed using STATA v7.0.

Patient Samples and Cell Lines

Patient samples were obtained from twelve chronic lymphocytic leukemia (CLL) patients, and mononuclear cells were isolated through Ficoll-Hypaque gradient centrifugation (Amersham Pharmacia Biotech, Piscataway, N.J.), as previously described (Calin et al., Proc. Natl. Acad. Sci. USA 2002, 99:15524-15529). Samples were then processed for RNA and DNA extraction according to standard protocols as described in Sambrook J et al. (1989), Molecular cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), the entire disclosure of which is herein incorporated by reference.

Seven human lung cancer cell lines were obtained from the American Type Culture Collection (ATCC; Manassas, Va.) and maintained according to ATCC instructions. These cell lines were: Calu-3, H1299, H522, H460, H23, H1650 and H1573.

Northern Blotting

Total RNA isolation from patient samples and cell lines described above was performed using the Tri-Reagent protocol (Molecular Research Center, Inc). RNA samples (30 μg each) were run on 15% acrylamide denaturing (urea) Criterion precast gels (Bio-Rad Laboratories, Hercules, Calif.) and then transferred onto Hybond-N+ membrane (Amersham Pharmacia Biotech), as previously described (Calin et al., Proc. Natl. Acad. Sci. USA 2002, 99:15524-15529). Hybridization with gamma-³²P ATP labeled probes was performed at 42° C. in 7% SDS, 0.2 M Na₂PO₄, pH 7.0 overnight. Membranes were washed at 42° C., twice in 2×SSPE, 0.1% SDS and twice with 0.5×SSPE, 0.1% SDS. Blots were stripped by boiling in 0.1% aqueous SDS/0.1×SSC for 10 minutes, and were reprobed several times. As a gel loading control, 5S rRNA was also loaded and was stained with ethidium bromide. Lung tissue RNA was utilized as the normal control; normal lung total RNA was purchased from Clontech (Palo Alto, Calif.).

Example 1 miR Genes Are Non-Randomly Distributed in the Human Genome

One hundred eighty-six human genes representing known or predicted miR genes were mapped, based on mouse homology or computational methods, as described above in the General Methods. The results are presented in Table 2. The names were as in the miRNA Registry; for new miR genes, sequential names were assigned. miR-213 from the Sanger database is different from miR-213 described in Lim et al. (2003, Science 299:1540). miR genes in clusters are separated by a forward slash “/”. The approximate location in Mb of each clone is presented in the last column.

TABLE 2
miR Database: Chromosome Location and Clustering

Name                                Chromosome      Genes in Cluster                                  Loc (Mb)
                                    location                                                          (Build 33)
let-7a-1                            09q22.2         let-7a-1/let-7f-1/let-7d                          90.2-.3
let-7a-2                            11q24.1         miR-125b-1/let-7a-2/miR-100                       121.9-122.15
let-7a-3                            22q13.3         let-7a-3/let-7b                                   44.7-.8
let-7b                              22q13.3         let-7a-3/let-7b                                   44.7-.8
let-7c                              21q11.2         miR-99a/let-7c/miR-125b-2                         16.7-.9
let-7d                              09q22.2         let-7a-1/let-7f-1/let-7d                          90.2-.3
let-7e                              19q13.4         miR-99b/let-7e/miR-125a                           56.75-57
let-7f-1                            09q22.2         let-7a-1/let-7f-1/let-7d                          90.2-.3
let-7f-2                            Xp11.2          miR-98/let-7f-2                                   52.2-.3
let-7g                              03p21.3         let-7g/miR-135-1                                  52.1-.3
let-7i                              12q14.1                                                           62.7-.9
miR-001b-2                          20q13.3         miR-133a-2/miR-1b-2                               61.75-.8
miR-001d                            18q11.1         miR-133a-1/miR-1d                                 19.25-.4
miR-007-1                           09q21.33                                                          80-80.1
miR-007-2                           15q25                                                             86.7-.8
miR-007-3                           19p13.3                                                           4.7-.75
miR-009-1 (=miR-131-1)              01q22                                                             153.1-.2
miR-009-2 (=miR-131-2)              05q14                                                             87.85-88
miR-009-3 (=miR-131-3)              15q25.3                                                           87.5
miR-010a                            17q21.3         miR-196-1/miR-10a                                 46.95-47.05
miR-010b                            02q31                                                             176.85-177
miR-015a                            13q14           miR-16a/miR-15a                                   49.5-.8
miR-015b                            03q26.1         miR-15b/miR-16b                                   161.35-.5
miR-016a                            13q14           miR-16a/miR-15a                                   49.5-.8
miR-016b                            03q26.1         miR-15b/miR-16b                                   161.35-.5
miR-017 (=miR-91)                   13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-018                             13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-019a                            13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-019b-1                          13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-019b-2                          Xq26.2          miR-92-2/miR-19b-2/miR-106a                       131.2-.3
miR-020                             13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-021 (=miR104-as)                17q23.2                                                           58.25-.35
miR-022                             17p13.3                                                           1.4-.6
miR-023a                            19p13.2         miR-24-2/miR-27a/miR-23a/miR-181c                 13.75-.95
miR-023b                            09q22.1         miR-24-1/miR-27b/miR-23b                          90.8-91
miR-024-1 (=miR-189)                09q22.1         miR-24-1/miR-27b/miR-23b                          90.8-91
miR-024-2                           19p13.2         miR-24-2/miR-27a/miR-23a/miR-181c                 13.75-.95
miR-025                             07q22           miR-106b/miR-25/miR-93-1                          99.25-.4
miR-026a                            03p21                                                             37.8-.9
miR-026b                            02q35                                                             219.1-.3
miR-027a                            19p13.2         miR-24-2/miR-27a/miR-23a/miR-181c                 13.75-.95
miR-027b                            09q22.1         miR-24-1/miR-27b/miR-23b                          90.8-91
miR-028                             03q28                                                             189.65-.85
miR-029a                            07q32           miR-29a/miR-29b                                   129.9-130.1
miR-029b (=miR-102-7.1)             07q32           miR-29a/miR-29b                                   129.9-130.1
miR-029c                            01q32.2-32.3    miR-29c/miR-102                                   204.6-.7
miR-030a-as                         06q12-13                                                          72.05-.2
miR-030a-s (=miR-097)               06q12-13                                                          72.05-.2
miR-030b                            08q24.2         miR-30d/miR-30b                                   135.5
miR-030c                            06q13                                                             71.95-72.1
miR-030d                            08q24.2         miR-30d/miR-30b                                   135.5
miR-031                             09p21                                                             21.3-.5
miR-032                             09q31.2                                                           105.1-.3
miR-033a                            22q13.2                                                           40.5-.8
miR-033b                            17p11.2                                                           17.6-.7
miR-034 (=miR-170)                  01p36.22                                                          8.8
miR-034a-1                          11q23           miR-34a-2/miR-34a-1                               111.3-.5
miR-034a-2                          11q23           miR-34a-2/miR-34a-1                               111.3-.5
miR-092-1                           13q31           miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1   90.82
miR-092-2                           Xq26.2          miR-92-2/miR-19b-2/miR-106a                       131.2-.3
miR-093-1                           07q22           miR-106b/miR-25/miR-93-1                          99.25-.4
miR-095                             04p16                                                             8-.2
miR-096                             07q32           miR-182s/miR-182as/miR-96/miR-183                 128.9-129
miR-098                             Xp11.2          miR-98/let-7f-2                                   52.2-.3
miR-099a                            21q11.2         miR-99a/let-7c/miR-125b-2                         16.7-.9
miR-099b                            19q13.4         miR-99b/let-7e/miR-125a                           56.75-57
miR-100                             11q24.1         miR-125b-1/let-7a-2/miR-100                       121.9-122.15
miR-101-1                           01p31.3                                                           64.85-95
miR-101-2                           09p24                                                             4.8-5
miR-102                             01q32.2-32.3    miR-29c/miR-102                                   204.5-.7
miR-103-1                           05q35.1                                                           167.8-.95
miR-103-2                           20p13                                                             3.82-.90
miR-105-1                           Xq28                                                              149.3-.4
miR-106b (=miR-94)                  07q22           miR-106b/miR-25/miR-93-1                          99.25-.4
miR-106a                            Xq26.2          miR-92-2/miR-19b-2/miR-106a                       131.2-.3
miR-107                             10q23.31                                                          91.45-.6
miR-108-1                           17q11.1         miR-108-1/miR-193                                 29.6-.8
miR-108-2                           16q13.1                                                           14.3-.5
miR-122a                            18q21                                                             55.85-56
miR-123 (=miR-126)                  09q34                                                             132.9-133.05
miR-124a-1                          08p23                                                             9.5-.65
miR-124a-2                          08q12.2                                                           64.9-65.1
miR-124a-3                          20q13.33                                                          62.4-.55
miR-125a                            19q13.4         miR-99b/let-7e/miR-125a                           56.75-57
miR-125b-1                          11q24.1         miR-125b-1/let-7a-2/miR-100                       121.9-122.1
miR-125b-2                          21q11.2         miR-99a/let-7c/miR-125b-2                         16.7-.9
miR-127                             14q32           miR-127/miR-136                                   99.2-.4
miR-128a                            02q21                                                             136.3-.5
miR-128b                            03p22                                                             35.45-.6
miR-129-1                           07q32                                                             127.25-.4
miR-129-2                           11p11.2                                                           43.65-.75
miR-130a                            11q12                                                             57.6-.7
miR-130b                            22q11.1                                                           20.2-.4
miR-132                             17p13.3         miR-212/miR-132                                   1.85-2
miR-133a-1                          18q11.1         miR-133a-1/miR-1d                                 19.25-.4
miR-133a-2                          20q13.3         miR-133a-2/miR-1b-2                               61.75-.8
miR-133b                            06p12           miR-206/miR-133b                                  51.9-52
miR-134                             14q32           miR-154/miR-134/miR-299                           99.4-.6
miR-135-1                           03p21.3         let-7g/miR-135-1                                  52.1-.3
miR-135-2                           12q23                                                             97.85-98
miR-136                             14q32           miR-127/miR-136                                   99.2-.4
miR-137                             01p21-22                                                          97.75
miR-138-1                           03p21                                                             43.85-.95
miR-138-2                           16q12-13                                                          56.55-.7
miR-139                             11q13                                                             72.55-.7
miR-140as                           16q22.1                                                           69.6-.8
miR-140s                            16q22.1                                                           69.6-.8
miR-141 (=overlap miR-156)          12p13           overlap miR-156 — cluster                         6.9-7.05
miR-142-as                          17q23                                                             56.75-.9
miR-142-s                           17q23                                                             56.75-.9
miR-143                             05q32-33        miR-145/miR-143                                   148.65-.8
miR-144                             17q11.2                                                           27.05
miR-145                             05q32-33        miR-145/miR-143                                   148.65-.8
miR-146                             05q34                                                             159.8-.9
miR-147                             09q33                                                             116.35-.55
miR-148                             07p15                                                             25.6-.8
miR-148b                            12q13                                                             54.35-.45
miR-149                             02q37.3                                                           241.3-.4
miR-150                             19q13                                                             54.6-.8
miR-151                             08q24.3                                                           141.4-.5
miR-152                             17q21                                                             46.4-.5
miR-153-1                           02q36                                                             220.1-.2
miR-153-2                           07q36                                                             156.5-.7
miR-154                             14q32           miR-154/miR-134/miR-299                           99.4-.6
miR-155 (BIC)                       21q21                                                             25.85
miR-156 (=miR-157)                  12p13           overlap miR-141 — cluster                         6.9-7.05
miR-159-1                           11q13           miR-159-1/miR-192                                 64.9-65
miR-161                             08p21                                                             21.8-.9
miR-175 (=miR-224)                  Xq28                                                              148.8-.9
miR-177                             08p21                                                             21.25-.35
miR-180                             22q11.21-12.2                                                     26.45
miR-181a (=miR-178-2)               09q33.1-34.13                                                     120.85-.95
miR-181b (=miR-178 = miR-213-LIM)   01q31.2-q32.1   miR-213S/miR-181b                                 195.2-.35
miR-181c                            19p13.3         miR-24-2/miR-27a/miR-23a/miR-181c                 13.75-.95
miR-182-as                          07q32           miR-182s/miR-182as/miR-96/miR-183                 128.9-129
miR-182-s                           07q32           miR-182s/miR-182as/miR-96/miR-183                 128.9-129
miR-183 (=miR-174)                  07q32           miR-182s/miR-182as/miR-96/miR-183                 128.9-129
miR-184                             15q24                                                             76.9-77.1
miR-185                             22q11.2                                                           18.35-.45
miR-186                             01p31                                                             70.9-71
miR-187                             18q12.1                                                           33.25-.4
miR-188                             Xp11.23-p11.2                                                     48.35-.5
miR-190                             15q21                                                             60.6-.8
miR-191                             03p21                                                             48.85-.95
miR-192 (=miR-158)                  11q13           miR-159-1/miR-192                                 64.9-65
miR-193                             17q11.2         miR-108-1/miR-193                                 29.6-.8
miR-194 (=miR-159-2)                01q41           miR-215/miR-194                                   216.7-.8
miR-195                             17p13                                                             6.75-.85
miR-196-1                           17q21           miR-196-1/miR-10a                                 46.9-47.1
miR-196-2                           12q13                                                             54-54.15
miR-197                             01p13                                                             109.2-.3
miR-198                             03q13.3                                                           121.3-.4
miR-199a-1 (=miR-199s)              19p13.2                                                           10.75-.8
miR-199a-2                          01q23.3         miR-214/miR-199a-2                                168.7-.8
miR-199as (= antisense miR-199a-1)  19p13.2                                                           10.75-.8
miR-199b (=miR-164)                 09q34                                                             124.3-.5
miR-200                             01p36.3                                                           0.9-1
miR-202                             10q26.3                                                           135
miR-203                             14q32.33                                                          102.4-.6
miR-204                             09q21.1                                                           66.9-67
miR-205                             01q32.2                                                           206.2-.3
miR-206                             06p12           miR-206/miR-133b                                  51.9-52
miR-208                             14q11.2                                                           21.8-22
miR-210                             11p15                                                             0.55-0.75
miR-211                             15q11.2-q12                                                       28.9-29.1
miR-212                             17p13.3         miR-212/miR-132                                   1.9-2.1
miR-213-SANGER                      01q31.3-q32.1   miR-213S/miR-181b                                 195.2-.35
miR-214                             01q23.3         miR-214/miR-199a-2                                168.7-.8
miR-215                             01q41           miR-215/miR-194                                   216.7-.8
miR-216                             02p16           miR-217/miR-216                                   56.2-.4
miR-217                             02p16           miR-217/miR-216                                   56.2-.4
miR-218-1                           04p15.32                                                          20.15-.35
miR-218-2                           05q35.1                                                           168-.15
miR-219                             06p21.2-21.31                                                     33.1-.25
miR-220                             Xq25                                                              120.6-.8
miR-221                             Xp11.3          miR-222/miR-221                                   44.35-.45
miR-222                             Xp11.3          miR-222/miR-221                                   44.35-.45
miR-223                             Xq12-13.3                                                         63.4-.5
miR-294-1                           16q22                                                             65.1-.3
miR-297-3                           20q13.2                                                           52.25-.35
miR-299                             14q32           miR-154/miR-134/miR-299                           99.4-.6
miR-301                             17q23                                                             57.5-.7
miR-302                             04q25                                                             113.9-114
miR-hes1                            19q13.4         miR-hes1/miR-hes2/miR-hes3                        58.9-59.05
miR-hes2                            19q13.4         miR-hes1/miR-hes2/miR-hes3                        58.9-59.05
miR-hes3                            19q13.4         miR-hes1/miR-hes2/miR-hes3                        58.9-59.05
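The location column of Table 2 uses an abbreviated range notation. A minimal sketch of how such entries might be expanded into numeric intervals is given below. It assumes that an end value beginning with a decimal point shares the integer part of the start value (so that “90.2-.3” means 90.2 to 90.3 Mb). This reading is an assumption, and the function name `parse_loc` is hypothetical.

```python
from typing import Tuple


def parse_loc(text: str) -> Tuple[float, float]:
    """Expand a Table 2 location entry into (start_mb, end_mb)."""
    text = text.strip()
    if "-" not in text:
        value = float(text)  # single position, e.g. "90.82"
        return value, value
    start_s, end_s = (part.strip() for part in text.split("-", 1))
    if end_s.startswith("."):
        # Assumed shorthand: reuse the integer part of the start value.
        end_s = start_s.split(".")[0] + end_s
    return float(start_s), float(end_s)


for entry in ["90.2-.3", "121.9-122.15", "56.75-57", "8-.2", "90.82"]:
    print(entry, "->", parse_loc(entry))
```

An entry such as “64.85-95” is returned literally as 64.85 to 95. Whether a decimal point is missing there cannot be decided from the table alone, so such entries would need manual checking.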

The distribution of the 186 human miR genes was found to be non-random. Ninety miR genes were located in 36 clusters, usually with two or three genes per cluster (median=2.5). The largest cluster found comprises six genes (miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1) and is located at 13q31 (Table 2). A significant association of the incidence of miR genes with specific chromosomes was found. Chromosome 4 was found to have a lower than expected rate of miR genes (IRR=0.27; p=0.035). Chromosomes 17 and 19 were found to have significantly more miR genes than expected, based on chromosome size (IRR=2.97, p=0.002 and IRR=3.39, p=0.001, respectively). Six of the 36 miR gene clusters (17%), which contain 16 of 90 clustered genes (18%), are located on these two small chromosomes, which account for only 5% of the entire genome.

Similar results were obtained using a model considering the distribution of miR genes only in the transcriptionally active portion of the genome (see Table 3).

Chromosome 1 is used as the baseline in the model with a rate of miR gene incidence of ~0.057, which is approximately equal to the overall rate of miR gene incidence across the genome.

TABLE 3
Location of miRs by chromosome and results of mixed effects Poisson regression model.

Chromosome   Length   # of miRs   IRR    p
1            279      16          —      —
2            251      7           0.49   0.112
3            221      10          0.79   0.557
4            197      3           0.27   0.035
5            198      6           0.53   0.183
6            176      6           0.59   0.277
7            163      13          1.39   0.377
8            148      6           0.71   0.469
9            140      15          1.87   0.082
10           143      2           0.24   0.060
11           148      11          1.29   0.508
12           142      6           0.74   0.523
13           118      8           0      1.000
14           107      7           1.14   0.771
15           100      5           0.87   0.789
16           104      5           0.84   0.731
17           88       15          2.97   0.002
18           86       4           0.81   0.708
19           72       14          3.39   0.001
20           66       5           1.32   0.587
21           45       4           1.55   0.433
22           48       6           2.09   0.123
X            163      12          1.28   0.513
Y            51       0           0      1.000
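As an arithmetic illustration of the baseline rate and of how an incidence rate ratio relates to the counts in Table 3, the sketch below computes crude (unadjusted) ratios against chromosome 1. It assumes that the “Length” column is in Mb, and it is not the mixed effects model itself. For the chromosomes shown, the crude ratios round to the IRR values listed in the table.

```python
# (length, number of miRs) for selected chromosomes, copied from Table 3.
TABLE3 = {
    "1": (279, 16),
    "4": (197, 3),
    "17": (88, 15),
    "19": (72, 14),
}


def rate(chrom: str) -> float:
    """miR genes per unit of chromosome length."""
    length, n_mirs = TABLE3[chrom]
    return n_mirs / length


def crude_irr(chrom: str, baseline: str = "1") -> float:
    """Unadjusted incidence rate ratio relative to the baseline chromosome."""
    return rate(chrom) / rate(baseline)


print(f"baseline rate (chromosome 1): {rate('1'):.3f}")
for chrom in ("4", "17", "19"):
    print(f"chromosome {chrom}: crude IRR = {crude_irr(chrom):.2f}")
```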

Example 2 miR Genes Are Located in or Near Fragile Sites

Thirty-five of 186 miRs (19%) were found in (13 miR genes), or within 3 Mb (22 miR genes) of cloned fragile sites (FRA). A set of 39 fragile sites with available cloning information was used in the analysis. Data were available for the exact dimension (mean 2.69 Mb) and position of ten of these cloned fragile sites (see General Methods above). miR genes occurred inside fragile sites at a rate 9.12 times higher than in non-fragile sites (p<0.001, using mixed effect Poisson regression models; see Tables 3 and 4). The same very high statistical significance was also found when only the 13 miRs located exactly inside a FRA or exactly in the vicinity of the “anchoring” marker mapped for a FRA were considered (IRR=3.21, p<0.001). Among the four most active common fragile sites (FRA3B, FRA16D, FRA6E, and FRA7H), the data demonstrate seven miRs in (miR-29a and miR-29b) or close (miR-96, miR-182s, miR-182as, miR-183, and miR-129-1) to FRA7H, the only fragile site where no candidate tumor suppressor (TS) gene has been found. The other three of the four most active sites contain known or candidate TS genes; i.e., FHIT, WWOX and PARK2, respectively (Ohta et al., 1996, Cell 84:587-597; Paige et al., 2001, Proc. Natl. Acad. Sci. USA 98:11417-11422; Cesari et al., 2003, Proc. Natl. Acad. Sci. USA 100:5956-5961).

Analysis of 113 fragile sites scattered in the human karyotype showed that 61 miR genes are located in the same cytogenetic positions as FRAs. Thirty-five miR genes were located inside twelve cloned FRAs. These data indicate that more miRs are located in or near FRAs, and that the results described herein represent an underestimation of miR gene/FRA association, likely because the mapping of these unstable regions is not complete.

TABLE 4
Mixed Effect Poisson Regression Results for the Association Between microRNAs and Several Types of Regions of Interest

Region of interest                       Incidence Rate Ratio (IRR)   95% CI IRR     p
Cloned Fragile sites vs. non-fragile     9.12                         6.22, 13.38    <0.001
  sites
HPV16 insertion vs. all other            3.22                         1.55, 6.68     <0.002
Deleted region vs. all other             4.08                         2.99, 5.56     <0.001
Amplified region vs. all other           3.97                         2.31, 6.83     <0.001
HOX Clusters vs. all other               15.77                        7.39, 33.62    <0.001
Homeobox genes vs. all other             2.95                         1.63, 5.34     <0.001

Note: “all other” means all the genome except the regions of interest.
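The confidence intervals in Table 4 can be read with the help of a short sketch. It assumes, without confirmation from the text, that the intervals are of the usual form exp(log IRR ± 1.96 × SE). Under that assumption the geometric midpoint of the two bounds should recover the IRR, and the standard error of log(IRR) can be back-calculated from the interval width. The helper names are hypothetical.

```python
import math

# (IRR, lower bound, upper bound) copied from Table 4.
TABLE4 = {
    "Cloned fragile sites": (9.12, 6.22, 13.38),
    "HPV16 insertion": (3.22, 1.55, 6.68),
    "Deleted region": (4.08, 2.99, 5.56),
    "Amplified region": (3.97, 2.31, 6.83),
    "HOX clusters": (15.77, 7.39, 33.62),
    "Homeobox genes": (2.95, 1.63, 5.34),
}


def geometric_midpoint(lower: float, upper: float) -> float:
    """Midpoint of an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


def implied_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(IRR) implied by a 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


for region, (irr, lower, upper) in TABLE4.items():
    mid = geometric_midpoint(lower, upper)
    se = implied_se(lower, upper)
    print(f"{region}: IRR={irr}, midpoint={mid:.2f}, implied SE={se:.3f}")
```

For every row the geometric midpoint agrees with the reported IRR to within rounding, which is consistent with intervals computed on the log scale.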

Example 3 miR Genes Are Located in or Near Human Papilloma Virus (HPV) Integration Sites

Because common fragile sites are preferential targets for HPV16 integration in cervical tumors, and infection with HPV16 or HPV18 is the major risk factor for developing cervical cancer, the association between miR gene locations and HPV16 integration sites in cervical tumors was analyzed. The data indicate that thirteen miR genes (7%) are located within 2.5 Mb of seven of seventeen (45%) cloned integration sites. miRs occurred at HPV16 integration sites at a rate 3.22 times higher than in the rest of the genome (p<0.002) (Tables 4 and 5). In one cluster of integration sites at chromosome 17q23, where three HPV16 integration sites are spread over roughly 4 Mb of genomic sequence, four miR genes (miR-21, miR-301, miR-142s and miR-142as) were found.

TABLE 5
Analyzed FRA Sites, Cancer Correlation and HPV Integration Sites

Symbol  Chromosome       Cancer correlation              Type                       Location     Closest miR(s)                Distance          HPV16
                                                                                    (Mb)                                       miR-FRA (Mb)      integration*
FRA1A   1p36
FRA1C   1p31                                             aphidicolin type, common   67.87        miR-186; miR-101-1            3; 3
FRA1F   1q21             bladder
FRA1H   1q42.1           cervical                        5-azacytidine, common      216.5        miR-194; miR-215              exact             YES
FRA2G   2q31             RCC
FRA2I   2q33             chronic myelogenous leukemia
FRA3B   3p14.2           esophageal carcinoma, lung,
                         stomach, kidney, cervical
                         cancer
FRA4B   4q12
FRA4C   4q31.1
FRA5C   5q31.1
FRA5E   5p14
FRA6E   6q26             ovarian
FRA6F   6q21             leukemias and solid tumors
FRA7E   7q21.2 or 21.11
FRA7F   7q22                                             aphidicolin type, common   100.2-107    miR-106b; miR-25; miR-93      less than 1
FRA7G   7q31.2           ovarian
FRA7H   7q32.3           esophageal                      aphidicolin type, common   129.8-130.4  miR-29b; miR-29a;             exact; 1 and 2.5
                                                                                                 miR-96; miR-182s; miR-182as;
                                                                                                 miR-183; miR-129-1
FRA7I   7q35             breast
FRA8B   8q22.1
FRA8E   8q24.1
FRA9D   9q22.1           bladder                         aphidicolin type, common   89.5-92      let-7a-1; let-7d; let-7f-1    exact
                                                                                                 miR-23b; miR-24-1; miR-27b
FRA9E   9q32-33.1        ovarian, bladder, cervical      aphidicolin type, common   101.3-111.9  miR-32                        exact             YES
FRA10B  10q25.2
FRA10C  10q21
FRA10D  10q22.1
FRA11A  11q13.3          hematopoietic and solid tumors  folic acid type, rare      66.18-66.9   miR-159-1; miR-192            1.2
FRA11B  11q23.3                                          folic acid type, rare      119.1-.2     miR-125b-1; let-7a-2; miR-100 2
FRA12A  12q13.1                                          folic acid type, rare      53.55        miR-196-2; miR-148b           1
FRA13C  13q21.2
FRA15A  15q22                                            aphidicolin type, common   60.93        miR-190                       exact
FRA16D  16q23.2          gastric adenocarcinoma,
                         adenocarcinomas of stomach,
                         colon, lung and ovary
FRA16E  16p12.1
FRA17B  17q23.1                                          aphidicolin type, common   58.25-58.35  miR-21                        exact             YES
                                                                                                 miR-301                       0.5
                                                                                                 miR-142s; miR-142as           1.5
FRA18A  18q12.2          esophageal carcinoma
FRA22A  22q13
FRAXA   Xq27.31
FRAXB   Xp22.3
FRAXE   Xq28
FRAXF   Xq28                                                                        146.58       miR-105-1; miR-175            2.2

Note: *other microRNAs located close to HPV16 integration sites were found in relation to FRA5C, FRA11C, FRA12B and FRA12E. Positions are indicated according to Build 33 of the Human Genome.

Example 4A miR Genes Are Located in or Near Cancer Associated Genomic Regions

Because the miR-FRA-HPV16 association has significance for cancer pathogenesis, miR genes might be involved in malignancies through other mechanisms, such as deletion, amplification, or epigenetic modifications. Thus, a search was performed for reported genomic alterations in human cancers, located in regions containing miR genes. PubMed was searched for reports of CAGRs such as minimal regions of loss-of-heterozygosity (LOH) suggestive of the presence of tumor-suppressor genes (TSs), minimal regions of amplification suggestive of the presence of oncogenes (OGs), and common breakpoint regions in or near possible OGs or TSs (see General Methods above). Overall, 98 of 187 (52.5%) miR genes were found to be located in CAGRs (see Tables 6 and 7). Eighty of the miR genes (43%) were found to be located exactly within minimal regions of LOH or minimal regions of amplification described in a variety of tumors, such as lung, breast, ovarian, colon, gastric and hepatocellular carcinoma, as well as leukemias and lymphomas (see Tables 6 and 7).

The analysis showed that on chromosome 9, eight of fifteen mapped miR genes (including six located in clusters) were located inside two regions of deletion on 9q (Simoneau et al., 1999, Oncogene 7:157-163): the clusters let-7a-1/let-7f-1/let-7d and miR-23b/miR-27b/miR-24-1 inside region B at 9q22.3 and miR-181a and miR-199b inside region D at 9q33-34.1 (Table 6). Furthermore, five other miR genes were located less than 2 Mb from the markers with the highest rate of LOH: miR-31 near IFNα, miR-204 near D9S15, miR-181 and miR-147 near GSN, and miR-123 near D9S67.

In breast carcinomas, two different regions of loss at 11q23, independent from the ATM locus, have been studied extensively: the first spans about 2 Mb between loci D11S1347 and D11S927; the second is located between loci D11S1345 and D11S1316 and is estimated at about 1 Mb (di Iasio et al., 1999, Oncogene 25:1635-1638). Despite extensive effort, the only candidate TS gene found was the PPP2R1B gene, involved in less than 10% of reported cases (Calin et al., 2002, Proc. Natl. Acad. Sci. USA 99:15524-15529; Wang et al., 1998, Science 282:284-7). Both of these minimal LOH regions contained numerous microRNAs: the cluster miR-34a-1/miR-34a-2 in the first and the cluster miR-125b-1/let-7a-2/miR-100 in the second.

High frequency LOH at 17p13.3 and relatively low TP53 mutation frequency in cases of hepatocellular carcinomas (HCC), lung cancers and astrocytomas indicate the presence of other TSs involved in the development of these tumors. One minimal LOH region correlated with HCC, and located telomeric to TP53 between markers D17S1866 and D17S1574 on chromosome 17, contained three miR genes: miR-22, miR-132, and miR-212. miR-195 is located between ENO3 and TP53 on chromosome 17.

Homozygous deletions (HD) in cancer can indicate the presence of TSs (Huebner et al., 1998, Annu. Rev. Genet. 32:7-31), and several miR genes are located in homozygously deleted regions without known TSs. In addition to miR-15a and miR-16a, located in the 13q14 HD region in B-CLL, the cluster miR-99a/let-7c/miR-125b-2 maps to a 21q11.1 region of HD in lung cancers, and miR-32 maps to a 9q31.2 region of HD in various types of cancer. Among the seven regions of LOH and HD on the short arm of chromosome 3, three harbor miRs: miR-26a in region AP20, miR-138-1 in region 5 at 3p21.3 and the cluster let-7g/miR-135-1 in region 3 at 3p21.1-p21.2. The locations of the miR genes/gene clusters are not likely to be random, because it was found that overall, the relative incidence of miRs in both deleted and amplified regions is highly significant (IRR=4.08, p<0.001 and IRR=3.97, p=0.001, respectively) (Table 4). Thus, these miRs expand the spectrum of candidate TSs.
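
The IRR values quoted above come from Table 4, which is not reproduced in this section. As a rough illustration only, and assuming that IRR denotes an incidence rate ratio in the usual sense (miR genes per Mb inside the regions of interest divided by miR genes per Mb elsewhere), the point estimate can be sketched in Python as follows. The function name and the counts in the usage line are hypothetical and are not data from this study; the statistical model behind the reported p-values is not shown.

```python
def incidence_rate_ratio(mirs_inside, mb_inside, mirs_outside, mb_outside):
    """Point estimate of an incidence rate ratio.

    Rate inside  = miR genes per Mb within the regions of interest.
    Rate outside = miR genes per Mb in the rest of the genome.
    All arguments are illustrative inputs, not values from Table 4.
    """
    if mb_inside <= 0 or mb_outside <= 0:
        raise ValueError("sequence lengths must be positive")
    if mirs_outside <= 0:
        raise ValueError("the outside count must be positive to form a ratio")
    rate_inside = mirs_inside / mb_inside
    rate_outside = mirs_outside / mb_outside
    return rate_inside / rate_outside


# Hypothetical counts: 8 miR genes in 100 Mb of deleted regions
# versus 2 miR genes in 100 Mb of other sequence.
example_irr = incidence_rate_ratio(8, 100.0, 2, 100.0)
```

An IRR well above 1 means that miR genes are denser inside the regions than outside them, which is the sense in which the text argues that their locations are unlikely to be random.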

TABLE 6 Examples of microRNAs Located in Minimal Deleted Regions, Minimal Amplified Regions, and Breakpoint Regions Involved in Human Cancers*

Chromosome | Location (defining markers) | Size (Mb) | MiR Gene | Histotype | Known OG/TS
3p21.1-21.2-D | ARP-DRR1 | 7 | let-7g/miR-135-1 | lung, breast cancer | -
3p21.3(AP20)-D | GOLGA4-VILL | 0.75 | miR-26a | epithelial cancer | -
3p23-21.31 (MDR2)-D | D3S1768-D3S1767 | 12.32 | miR-26a; miR-138-1 | nasopharyngeal cancer | -
5q32-D | ADRB2-ATX1 | 2.92 | miR-145/miR-143 | myelodysplastic syndrome | -
9q22.3-D | D9S280-D9S1809 | 1.46 | miR-24-1/miR-27b/miR-23b; let-7a-1/let-7f-1/let-7d | urothelial cancer | PTC, FANCC
9q33-D | D9S1826-D9S158 | 0.4 | miR-123 | NSCLC | -
11q23-q24-D | D11S927-D11S1347 | 1.994 | miR-34a-1/miR-34a-2 | breast, lung cancer | PPP2R1B
11q23-q24-D | D11S1345-D11S1328 | 1.725 | miR-125b-1/let-7a-2/miR-100 | breast, lung, ovary, cervix cancer | -
13q14.3-D | D13S272-D13S25 | 0.54 | miR-15a/miR-16a | B-CLL | -
13q32-33-A | stSG15303-stSG31624 | 7.15 | miR-17/miR-18/miR-19a/miR-20/miR-19b-1/miR-92-1 | follicular lymphoma | -
17p13.3-D | D17S1866-D17S1574 | 1.899 | miR-22; miR-132; miR-212 | HCC | -
17p13.3-D | ENO3-TP53 | 2.275 | miR-195 | lung cancer | TP53
17q22-t(8; 17) | miR-142s/c-MYC | | miR-142s; miR-142as | prolymphocytic leukemia | c-MYC
17q23-A | CLTC-PPM1D | 0.97 | miR-21 | neuroblastoma | -
20q13-A | FLJ33887-ZNF217 | 0.55 | miR-297-3 | colon cancer | -
21q11.1-D | D21S1911-ANA | 2.84 | miR-99a/let-7c/miR-125b | lung cancer | -

Note: *OG - oncogene; TS - tumor suppressor gene; D - deleted region; A - amplified region; NSCLC - Non-Small Cell Lung Cancer; HCC - Hepatocellular carcinoma; B-CLL - B-Chronic Lymphocytic Leukemia; PTC - patched homolog (Drosophila); FANCC - Fanconi anemia, complementation group C; PPP2R1B - protein phosphatase 2, regulatory subunit A (PR 65), β isoform. miR genes in a cluster are separated by a slash.

TABLE 7 MicroRNAs Located in Minimal Deleted Regions, Minimal AmplifiedRegions and Breakpoint Regions Involved in Human Cancers Type of regionPosition Position Chromosome (name) Marker 1 (Mb) Marker 2 (Mb) 01p31 DD1S2638 62.92 ARHI 67.885 01p36.3 D D1S468 3.36 D1s2697 15.23 02q21 DD2S1334 136.66 02q37 D D2S125 241.5 03p21.1-21.2 D ARP 51.5 DRR1 58.503p21.3 D (AP20) GOLGA4 37.25 VILL 38 03p23-21.31 D (MDR2) D3S1768 34.59D3S1767 46.91 03q27 t(3; 11)(q27; q23.1) LAZ3/BCL6 188.75 BOB1/ 110.78OBF1 04p15.3 D D4S1608 18.83 D4S404 23.98 05q31-33 D D5S1480 144.17D5S820 156.1 05q32 D ADRB2 148.23 ATX1 151.15 07q32 D D7S3061 122.84D7S1804 131.25 07q32-q33 D D7S2531 130.35 D7S1804 131.69 08p21 D (MRL1)D8S560 21.61 D8S1820 28.02 08p21 D D8S282 21.42 08p22 D D8S254 16.62SFTP2 22.05 08p23.1 A D8S1819 6.737 D8S550 10.919 09p21 D (LOH) IFNA21.2 D9S171/ 22.07 S1814 09p21 D IFNA 21.5 09p21 D IFNA 21.5 09p21 DCDKN2A, CDKN2B 21.9 09q22 D D9S280 92.47 D9S1809 93.93 09q22.3 D (reg B)D9S12 91.21 D9S180 R 96.03 09q32 D D9S1677 107.35 09q33 D D9S1826 133.88D9S158 134.53 09q33-34.1 D (reg D) GSN 119.45 D9S260 127.09 09q34 DD9S158 134.54 11p15 D D11S2071 0.23 11p15.5 D (LOH11B) HRAS 0.52D11S1363 1.05 11q13 D D11S4946 64.35 D11S4939 64.54 11q22 D D11S940/100.65 CD3D/ 118.7 S1782 D11S4104 11q22.1-23.2 D (MDR3) D11S2017 107.05D11S965 111.3 11q22.3-q25 D D11S1340 116.12 D11S912 128.16 11q22-q23 DD11S2106/ 108.76 D11S1356 117.454 S2220 11q23 D D11S1647 110.34 NCAM2/112.5 NCAM1 11q23 D D11S1345 121.83 D11S1328 123.56 11q23 D D11S1345121.83 D11S1328 123.56 11q23.1-23.2 D (LOH) D11S4167 121.68 D11S4144122.96 11q23-q24 D D11S927 109.676 D11S1347 111.67 (LOH11CR1) 11q23-q24D D11S1345 121.835 D11S1328 123.56 (LOH11CR2) 12p13 t(7; 12)(q36; p13)TEL(ETV6) 11.83 near HLXB9 156.21 12q13-q14 A DGKA 54.6 BLOV1 67.412q13-q15 A GLI 56.15 MDM2 67.5 12q22 D D12S1716 95.45 P382A8AG/ 97.47D12S296 12q22 D D12S377/ 94.1 D12S296 97.47 D12S101 13q14.3 D D13S319/48.5 D13S25 49.04 D13S272 13q14 D D13S260 30.23 AFMa301wb5 
48.62 13q14 DRb1 46.77 BCMS 48.46 (DLEU-1) 13q14.3 D (RMD) D13S272 48.5 AF07740148.765 13q14.3 D (Reg II) D13S153 46.68 D13S1289 62.43 13q14.3 D D13S27348.11 D13S176 58.31 13q14.3 D D13S1168 48.28 D13S25 49.04 13q32-33 AstSG15303 89.7 stSG31624 96.85 14q11.1-q12 D D14S283 20.67 D14S64 22.5514q32 D D14S51 95.56 telomere 105.2 15q11.1-15 D D15S128 22.67 D15S101236.72 17p11.2 A PNMT 17.351 17p11.2 D D17S1857 16.61 D17S805/ 20.79 S95917p11.2 D D117S1857 16.61 D17S805/ 20.79 S959 17p13 D D17S578 7.02517p13.3 D D17S1866 0.121 D17S1574 2.02 17p13.3 D ENO3 5.5 TP53 7.77517p13.3 D D17S1574 2.02 D17S379 2.46 17q11.1 D NF1 29.7 17q11.1 D NF129.7 17q11.2 A MLN 62 27.22 (TRAF4) 17q11.2 D (NF1 locus) CYTOR4 29.25WI-12393 30.52 17q22 t(8; 17) “BCL3” 56.95 c-MYC 128.7 17q23 A RAD51C57.116 17q23 A RAD51C 57.116 17q23 A CLTC 58.21 PPM1D 59.18 17q25 A(SRO2) D17S1306 53.76 D17S1604 58.45 19p13.3 D D19S886 0.95 D19S216 4.919p13.3 D (LOH) D19S216 4.9 D19S549 5.44 19p13.3 D (HZYG) D19S894 4.34D19S395 7.32 19p13.3 D (LOH) D19S886 0.95 D19S216 4.9 20q13 A FLJ3388752.2 ZNF217 52.75 20q13.1 A ZNF217 52.285 20q13.2 A D20S854 52.68D20S120 53.69 20q13.2 A ZNF217 52.85 21q11.1 D D21S120/ 15.06 ANA 17.9S1911 21q21 A BIC 25.8 BIC 25.9 22q12.2-q13.33 D D22S280 31.53 D22S27443.54 22q12.3-q13.33 D D22S280 31.53 D22S282 42.1 22q12.1 t(4; 22) MN126.5 Xq25-26.1 D DXS1206 125.08 HPRT 132.31 Size/ miR Distance LocationChromosome (Mb) Histotype Closest miR (Mb) 01p31 4.96 ovarian andmiR-101-1 64.9 breast cancer 01p36.3 0 Non Small Cell miR-34 8.8 LungCa. 02q21 0.1 gastric ca. miR-128a 136.55 02q37 0.2 hepatocellularmiR-149 241.65 carcinoma (HCC) 03p21.1-21.2 7 lung, breast ca.let-7g/miR-135-1 52.3 03p21.3 0.75 epithelial miR-26a 38 malignancies03p23-21.31 12.32 nasopharyngeal miR-26a; miR- 38; 44 carcinoma 138-103q27 B cell leukemia miR-34a-2/miR 110.9 line (Karpas 231) 34a-104p15.3 5.15 primary bladder miR-218-1 20.25 ca. 05q31-33 11.93 prostateca. 
miR-145/miR- 148.7 aggressiveness 143 05q32 2.92 myelodysplasticmiR-145/miR- 148.7 syndrome 143 07q32 8.41 prostate ca. miR-129-1; miR-127.3; (aggressiveness) 182s/miR- 129; 130 182as/miR- 96/miR-183;miR29a/miR-29b 07q32-q33 1.34 prostate ca. miR-29a/miR- 130(aggressiveness) 29b 08p21 6.41 HCC miR-161; miR- 22; 21.5 177 08p21 0.1HCC miR-177 21.5 08p22 5.43 oral and miR-161; miR- 22; 21.5 laryngeal177 squamous carcinoma. 08p23.1 4.18 malignant miR-124a-1 9.75 fibroushistiocytomas (MFHs) 09p21 0.87 primary bladder miR-31 21.4 tumor 09p210 lung miR-31 21.4 adenocarcinoma 09p21 0 gastric ca. miR-31 21.4 09p210.5 breast ca. miR-31 21.4 09q22 1.46 urothelial ca. miR-24-1/miR- 92.9;92.3 27b/miR-23b; let-7a-1/let- 7f1/let-7d 09q22.3 4.82 bladder ca.let-7a-1/let- 92.3; 92.9 7f1/let-7d; miR- 24-1/miR- 27b/miR-23b 09q320.2 Small Cell Lung miR-32 107.15 Ca., Non-Small Cell Lung Ca. 09q33 0.4NSCLC miR-123 134.95 09q33-34.1 7.64 bladder ca. miR-181a; miR- 122.85;199b 126.3 09q34 0.4 HCC miR-123 134.95 11p15 0.4 ovarian ca. miR-2100.6 11p15.5 0.53 lung ca. miR-210 0.6 11q13 0.19 sporadic miR-159-1/miR-64.45 follicular thyroid 192 tumor 11q22 18.05 lung miR-34a-1/miR- 111adenocarcinoma 34a-2 11q22.1-23.2 4.25 nasopharyngeal miR-34a-1/miR- 111carcinoma 34a-2 11q22.3-q25 12.04 ovarian ca. miR125b-1/let- 121.57a-2/miR-100 11q22-q23 8.7 chronic miR-34a-1/miR- 111 lymphocytic 34a-2leukemia 11q23 2.16 lung ca. miR-34a-1/miR- 111 34a-2 11q23 1.73 lungmiR125b-1/let- 121.5 adenocarcinoma 7a-2/miR-100 11q23 1.73 lungmiR125b-1/let- 121.5 adenocarcinoma 7a-2/miR-100 11q23.1-23.2 1.28cervical ca. miR125b-1/let- 121.5 7a-2/miR-100 11q23-q24 1.994 breast,lung ca. miR-34a-1/miR- 111 34a-2 11q23-q24 1.725 breast, lung,miR125b-1/let- 121.5 ovary, cervix ca. 7a-2/miR-100 12p13 acute myeloidmiR-153-2 156.6 leukemia (AML) 12q13-q14 12.8 adenocarcinomas let-7i61.35 of lung and esophagus 12q13-q15 11.35 bladder ca. 
let-7i 61.3-.4512q22 2.02 male germ cell miR-135-2 96.5 tumors 12q22 3.37 male germcell miR-135-2 96.5 tumors. 13q14.3 0.54 B-Chronic miR-15a/miR- 48.5Lymphocytic 16a Leuk (B-CLL) 13q14 18.39 adult miR-15a/miR- 48.5Lymphoblastic 16a leukemia 13q14 1.69 lipoma miR-15a/miR16a 48.5 13q14.30.265 CLL miR-15a/miR- 48.5 16a 13q14.3 15.75 head-and-neck miR-15a/miR-48.5 squamous-cell 16a carcinoma 13q14.3 10.2 oral ca. miR-15a/miR- 48.516a 13q14.3 0.76 B-CLL miR-15a/miR- 48.5 16a 13q32-33 7.15 follicularmiR-17/miR- 89.7 Lymphoma 18/miR-19a/miR- 20/miR-19b- 1/miR-92-114q11.1-q12 1.88 malignant miR-208 21.8 mesothelioma 14q32 9.64nasopharyngeal miR-127/miR- 99.3; 99.5; carcinoma 136; miR- 102.5154/miR- 134/miR-299; miR-203 15q11.1-15 14.05 malignant miR-211 29mesothelioma. 17p11.2 0.5 breast ca miR-33b 17.8 17p11.2 4.18 kidney ca(Birt- miR-33b 17.9 Hogg-Dube sy) 17p11.2 4.18 medulloblastoma miR-33b17.9 (Smith-Magenis syndrome) 17p13 0 HCC miR-195 7 17p13.3 1.9 HCCmiR-22; miR- 1.75; 2.2 132; miR-212 17p13.3 2.275 lung ca. miR-195 717p13.3 0.44 lung ca. miR-132; miR- 2.2 212 17q11.1 0.3 ovarian ca.miR-108-1 30 17q11.1 0.3 ovarian ca. miR-193 30 17q11.2 0.1 primarybreast miR-144 27.35 ca. 17q11.2 1.27 NF1 miR-108-1/miR- 30; 30microdeletion 193 17q22 prolymphocytic miR-142s/miR- 56.95 leukemia142as 17q23 0.25 breast ca. miR-142s/miR- 56.95 142as 17q23 0.5 breastca. miR-301 57.7 17q23 0.97 neuroblastoma miR-21 58.45 17q25 4.69 breastca. miR-142s; miR- 56.95; 142as; miR-301; 57.7; miR-21 58.45 19p13.33.95 lung miR-7-3 4.75 adenocarcinoma 19p13.3 0.54 gynecological miR-7-34.75 tumor in Peutz- Jegher's sy 19p13.3 2.98 gynecological miR-7-3 4.75tumor in Peutz- Jegher's sy 19p13.3 3.95 pancreatic and miR-7-3 4.75biliary ca 20q13 0.55 colon ca miR-297-3 52.35 20q13.1 0 ovarianmiR-297-3 52.35 20q13.2 1.01 gastric miR-297-3 52.35 adenocarcinoma20q13.2 0.5 head/neck miR-297-3 52.35 squamous carcinoma 21q11.1 2.84lung ca. (cell line miR-99a/let- 16.8 MA17) 7c/miR-125b-2 21q21 0.1colon ca. 
miR-155(BIC) 25.85 22q12.2-q13.33 12.01 colorectal ca. miR-33a40.6 22q12.3-q13.33 10.57 astrocytomas miR-33a 40.6 22q12.1 meningiomamiR-180 26.45 Xq25-26.1 7.23 advanced miR-92-2/miR- 132 ovarian ca.19b-2/miR-106a Note: D—deletion; A—amplification; ca.—cancer;sy—syndrome. The distance (in Mb) from the markers used in genome-wideanalysis is shown. miRs in clusters are separated by a slash. Positionsare according to BUILD 34, version 1, of the Human Genome
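
The relationship between the marker positions, the miR location and the Size/Distance column of Table 7 can be sketched in Python. This is a minimal illustration, assuming that the column reports the span between the two markers when the miR lies between them and the distance to the nearest marker otherwise; the function name is hypothetical, and the positions in the usage lines are taken from the 03p21.1-21.2, 05q32 and 09q33 rows of Table 7.

```python
def locate_mir(mir_mb, marker1_mb, marker2_mb):
    """Classify a miR position relative to a region bounded by two markers.

    Returns the region size and, if the miR lies outside the region,
    its distance to the nearest marker. All positions are in Mb.
    """
    low, high = sorted((marker1_mb, marker2_mb))
    size = round(high - low, 3)
    if low <= mir_mb <= high:
        return {"inside": True, "region_size_mb": size, "distance_mb": 0.0}
    distance = round(min(abs(mir_mb - low), abs(mir_mb - high)), 3)
    return {"inside": False, "region_size_mb": size, "distance_mb": distance}


# let-7g/miR-135-1 at 52.3 Mb, between ARP (51.5) and DRR1 (58.5)
row_3p21 = locate_mir(52.3, 51.5, 58.5)
# miR-145/miR-143 at 148.7 Mb, between ADRB2 (148.23) and ATX1 (151.15)
row_5q32 = locate_mir(148.7, 148.23, 151.15)
# miR-123 at 134.95 Mb, beyond D9S158 (134.53); the other marker is D9S1826 (133.88)
row_9q33 = locate_mir(134.95, 133.88, 134.53)
```

Under this reading, the first two rows are miRs lying exactly within the minimal region, whereas miR-123 falls about 0.4 Mb outside the 9q33 markers, consistent with the value listed for that row.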

Example 4B Effect of Genomic Location on miR Gene Expression

In order to investigate whether the genomic location in deleted regions influences miR gene expression, a set of lung cancer cell lines was analyzed. miR-26a and miR-99a, located at 3p21 and 21q1.2, respectively, are not expressed or are expressed at low levels in lung cancer cell lines. The locations of miR-26a and miR-99a correlate with regions of LOH/HD in lung tumors. However, the expression of miR-16a (located at 13q14) was unchanged in the majority of lung tumor cell lines as compared to normal lung (see FIG. 1).

Several miR genes are located near breakpoint regions, including miR-142s at 50 nt from the t(8; 17) translocation involving chromosome 17 and MYC, and miR-180 at 1 kb from the MN1 gene involved in a t(4; 22) translocation in meningioma (Table 6). The t(8; 17) translocation brings the MYC gene near the miR gene promoter, with consequent MYC over-expression, while the t(4; 22) translocation inactivates the MN1 gene, and possibly inactivates the miR gene located in the same position. Other miR genes are located relatively close to chromosomal breakpoints, such as the cluster miR-34a-1/miR-34a-2 and miR-153-2 (see Table 7). Further supporting a role for miR-122a in cancer, it was found herein that human miR-122a is located in the minimal amplicon around MALT1 in aggressive marginal zone lymphoma (MZL), and was found to be about 160 kb from the breakpoint region of translocation t(11; 18) in mucosa-associated lymphoid tissue (MALT) lymphoma (Sanchez-Izquierdo et al., 2003, Blood 101:4539-4546). Apart from miR-122a, several other miR genes were located in regions particularly prone to cancer-specific abnormalities, such as miR-142s and miR-142as, located at 17q23 close to a t(8; 17) breakpoint in B cell acute leukemia, and also located within the minimal amplicon in breast cancer and near the FRA17B site, which is also a target for HPV16 integration in cervical tumors (see Tables 5 and 7).

Example 5 MicroRNAs are Located in or Near HOX Gene Clusters

Homeobox-containing genes are a family of transcription factor genes that play crucial roles during normal development and in oncogenesis. HOXB4, HOXB5, HOXC9, HOXC10, HOXD4 and HOXD8, all with miR gene neighbors, are deregulated in a variety of solid and hematopoietic cancers (Cillo et al., 1999, Exp. Cell Res. 248:1-9; Owens et al., 2002, Stem Cells 20:364-379). A strong correlation was found between the location of specific miR genes and homeobox (HOX) genes. The miR-10a and miR-196-1 genes are located within the HOX B cluster on 17q21, while miR-196-2 is within the HOX C cluster at 12q13, and miR-10b maps to the HOX D cluster at 2q31 (see FIG. 2). Moreover, three other miRs (miR-148, miR-152 and miR-148b) are close to HOX clusters (less than 1 Mb; see FIG. 2). The 1 Mb distance was selected because some form of long-range coordinated regulation of gene expression was shown to extend up to one megabase from HOX clusters (Kamath et al., 2003, Nature 421:231-7). Such proximity of miR genes to HOX gene clusters is unlikely to have occurred by chance (IRR=15.77; p<0.001) (Table 4). Because collinear expression of, and cooperation between, HOX genes is well demonstrated, these data indicated that miRs are altered along with the HOX genes in human cancers.

Next, it was determined whether miR genes were located within class II HOX gene clusters as well. Fourteen additional human HOX gene clusters (Pollard et al., 2000, Current Biology 10:1059-1062) were analyzed, and seven miR genes (miR-129-1, miR-153-2, let-7a-1, let-7f-1, let-7d, miR-202 and miR-139) were located within 0.5 Mb of class II homeotic genes, a result which was highly unlikely to occur by chance (IRR=2.95, p<0.001) (Table 4).

Example 6 Expression of miR Gene Products in Human Cells

The cDNA sequence encoding the entire miR precursor transcript of an miR gene is separately cloned into the context of an irrelevant mRNA expressed under the control of the cytomegalovirus immediate early (CMV-IE) promoter, according to the procedure of Zeng et al., 2002, Mol. Cell. 9:1327-1333, the entire disclosure of which is herein incorporated by reference.

Briefly, Xho I linkers are placed on the end of double-stranded cDNA sequences encoding an miR precursor, and this construct is separately cloned into the Xho I site present in the pBC12/CMV plasmid. The pBC12/CMV plasmid is described in Cullen, 1986, Cell 46:973-982, the entire disclosure of which is herein incorporated by reference.

pCMV plasmid containing the miR precursor coding sequence is transfected into cultured human 293T cells by standard techniques using the FuGene 6 reagent (Roche). Total RNA is extracted as described above, and the presence of the processed miR transcript is detected by Northern blot analysis with an miR probe specific for the miR transcript.

pCMV-miR is also transfected into cultured human normal cells or cells with proliferative disorders, such as cancer cells. For example, the proliferative disease or cancer cell types include ovarian cancer, breast cancer, small cell lung cancer, sporadic follicular thyroid tumor, chronic lymphocytic leukemia, cervical cancer, acute myeloid leukemia, adenocarcinomas, male germ cell tumor, non-small cell lung cancer, gastric cancer, hepatocellular carcinoma, lung cancer, nasopharyngeal cancer, B-chronic lymphocytic leukemia, lipoma, mesothelioma, kidney cancer, NF1 microdeletion, neuroblastoma, medulloblastoma, pancreatic cancer, biliary cancer, colon cancer, gastric adenocarcinoma, head/neck squamous carcinoma, astrocytoma, meningioma, B cell leukemia, primary bladder cancer, prostate cancer, myelodysplastic syndrome, oral cavity carcinoma, laryngeal squamous carcinoma, and urothelial cancer. Total RNA is extracted as described above, and the presence of processed miR transcripts in the cancer cells is detected by Northern blot analysis with miR specific probes. The transfected cells are also evaluated for changes in morphology, the ability to overcome contact inhibition, and other markers indicative of a transformed phenotype.

Example 7 Preparation of Liposomes Encapsulating miR Gene Products

Liposome Preparation 1. Liposomes composed of lactosyl cerebroside, phosphatidylglycerol, phosphatidylcholine, and cholesterol in molar ratios of 1:1:4:5 are prepared by the reverse phase evaporation method described in U.S. Pat. No. 4,235,871, the entire disclosure of which is herein incorporated by reference. The liposomes are prepared in an aqueous solution of 100 μg/ml processed miR transcripts or 500 μg/ml pCMV-microRNA. The liposomes thus prepared encapsulate either the processed microRNA, or the pCMV-microRNA plasmids.

The liposomes are then passed through a 0.4 micron polycarbonate membrane and suspended in saline, and are separated from non-encapsulated material by column chromatography in 135 mM sodium chloride, 10 mM sodium phosphate (pH 7.4). The liposomes are used without further modification, or are modified as described herein.

A quantity of the liposomes prepared above is charged to an appropriate reaction vessel to which is added, with stirring, a solution of 20 mM sodium metaperiodate, 135 mM sodium chloride and 10 mM sodium phosphate (pH 7.4). The resulting mixture is allowed to stand in darkness for 90 minutes at a temperature of about 20° C. Excess periodate is removed by dialysis of the reaction mixture against 250 ml of buffered saline (135 mM sodium chloride, 10 mM sodium phosphate, pH 7.4) for 2 hours. The product is a liposome having a surface modified by oxidation of carbohydrate hydroxyl groups to aldehyde groups. Targeting groups or opsonization inhibiting moieties are conjugated to the liposome surface via these aldehyde groups.

Liposome Preparation 2. A second liposome preparation composed of maleimidobenzoyl-phosphatidylethanolamine (MBPE), phosphatidylcholine and cholesterol is obtained as follows. MBPE is an activated phospholipid for coupling sulfhydryl-containing compounds, including proteins, to the liposomes.

Dimyristoylphosphatidylethanolamine (DMPE) (100 mmoles) is dissolved in 5 ml of anhydrous methanol containing 2 equivalents of triethylamine and 50 mg of m-maleimidobenzoyl N-hydroxysuccinimide ester, as described in Kitagawa et al. (1976), J. Biochem. 79:233-236, the entire disclosure of which is herein incorporated by reference. The resulting reaction is allowed to proceed under a nitrogen gas atmosphere overnight at room temperature, and is subjected to thin layer chromatography on Silica gel H in chloroform/methanol/water (65/25/4), which reveals quantitative conversion of the DMPE to a faster migrating product. Methanol is removed under reduced pressure and the products re-dissolved in chloroform. The chloroform phase is extracted twice with 1% sodium chloride and the maleimidobenzoyl-phosphatidylethanolamine (MBPE) purified by silicic acid chromatography with chloroform/methanol (4/1) as the solvent. Following purification, thin-layer chromatography indicates a single phosphate-containing spot that is ninhydrin negative.

Liposomes are prepared with MBPE, phosphatidylcholine and cholesterol in molar ratios of 1:9:8 by the reverse phase evaporation method of U.S. Pat. No. 4,235,871, supra, in an aqueous solution of 100 μg/ml processed microRNA or a solution of 500 μg/ml pCMV-miR (see above). Liposomes are separated from non-encapsulated material by column chromatography in 100 mM sodium chloride-2 mM sodium phosphate (pH 6.0).
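
The molar ratios given for the two liposome preparations (1:1:4:5 and 1:9:8) translate directly into mole fractions of each lipid component. The following Python sketch shows that arithmetic; the function and variable names are illustrative only, and the calculation says nothing about absolute quantities, which the text does not specify.

```python
from fractions import Fraction


def mole_fractions(ratio_by_lipid):
    """Convert a molar ratio (parts per lipid) into exact mole fractions."""
    total = sum(ratio_by_lipid.values())
    if total <= 0:
        raise ValueError("molar ratio parts must sum to a positive number")
    return {lipid: Fraction(parts, total) for lipid, parts in ratio_by_lipid.items()}


# Liposome Preparation 1: molar ratio 1:1:4:5
PREPARATION_1 = {
    "lactosyl cerebroside": 1,
    "phosphatidylglycerol": 1,
    "phosphatidylcholine": 4,
    "cholesterol": 5,
}

# Liposome Preparation 2: molar ratio 1:9:8
PREPARATION_2 = {
    "MBPE": 1,
    "phosphatidylcholine": 9,
    "cholesterol": 8,
}

fractions_1 = mole_fractions(PREPARATION_1)  # cholesterol is 5/11 of total lipid
fractions_2 = mole_fractions(PREPARATION_2)  # phosphatidylcholine is 9/18 = 1/2
```

Exact fractions are used rather than floating-point values so that the components of each preparation sum to exactly one.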

Example 8 Attachment of Anti-Tumor Antibodies to Liposomes

An appropriate vessel is charged with 1.1 ml (containing about 10 mmoles) of Liposome Preparation 1 (see above) carrying reactive aldehyde groups, or Liposome Preparation 2 (see above). 0.2 ml of a 200 mM sodium cyanoborohydride solution and 1.0 ml of a 3 mg/ml solution of a monoclonal antibody directed against a tumor cell antigen are added to the preparation, with stirring. The resulting reaction mixture is allowed to stand overnight while maintained at a temperature of 4° C. The reaction mixture is separated on a Biogel A5M agarose column (Biorad, Richmond, Calif.; 1.5×37 cm).

Example 9 Inhibition of Human Tumor Growth In Vivo with miR Gene Products

A cancer cell line, such as one of the lung cancer cell lines described above or a tumor-derived cell, is inoculated into nude mice, and the mice are divided into treatment and control groups. When tumors in the mice reach 100 to 250 cubic millimeters, processed miR transcripts encapsulated in liposomes are injected directly into the tumors of the treatment group. The tumors of the control group are injected with liposomes encapsulating carrier solution only. Tumor volume is measured throughout the study.

Example 10 Oligonucleotide Microchip for Genome-Wide miRNA Profiling

Introduction

A microarray microchip containing 368 gene-specific oligonucleotide probes generated from 248 miRNAs (161 human, 84 mouse, and 3 Arabidopsis) and 15 tRNAs (8 human and 7 mouse) was prepared as follows. These sequences correspond to human and mouse miRNAs found in the miRNA Registry (June 2003) (Griffiths-Jones, S. (2004) Nucleic Acids Res. 32, D109-D111) or collected from published literature (Lagos-Quintana, M., Rauhut, R., Lendeckel, W. & Tuschl, T. (2001) Science 294, 853-858; Lim, L. P., Glasner, M. E., Yekta, S., Burge, C. B. & Bartel, D. P. (2003) Science 299, 1540; Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J., Mann, M. & Dreyfuss, G. (2002) Genes Dev 16, 720-728). For 76 miRNAs, two different oligonucleotide probes were designed, one containing the active sequence and the other specific for the precursor. Using these distinct sequences, we were able to separately analyze the expression of miRNA and pre-miRNA transcripts for the same gene.

Various specificity controls were used to validate data. For intra-assay validation, individual oligonucleotide probes were printed in triplicate. Fourteen oligonucleotides had a total of six replicates because of identical mouse and human sequences and therefore were spotted on both human and mouse sections of the array. Several mouse and human orthologs differ by only a few bases, serving as controls for the hybridization stringency conditions. tRNAs from both species were also printed on the microchip, providing an internal, relatively stable positive control for specific hybridization, while Arabidopsis sequences were selected, based on the absence of any homology with known miRNAs from other species, and used as controls for non-specific hybridization.
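
One way the triplicate spots described above could be used for intra-assay validation is to summarize each probe by the mean and coefficient of variation (CV) of its replicate intensities and to flag probes whose replicates disagree. The Python sketch below is only an illustration of that idea: the text does not state how replicates were evaluated, and the function names, the 0.2 CV cutoff and the intensity values are all hypothetical.

```python
from statistics import mean, stdev


def summarize_replicates(intensities):
    """Return mean, sample standard deviation and CV for replicate spots."""
    if len(intensities) < 2:
        raise ValueError("at least two replicate spots are required")
    avg = mean(intensities)
    sd = stdev(intensities)
    cv = sd / avg if avg else float("inf")
    return {"mean": avg, "sd": sd, "cv": cv}


def flag_inconsistent(spots_by_probe, max_cv=0.2):
    """List probes whose replicate CV exceeds max_cv (a hypothetical cutoff)."""
    return sorted(
        probe
        for probe, spots in spots_by_probe.items()
        if summarize_replicates(spots)["cv"] > max_cv
    )


# Hypothetical signal intensities for two probes printed in triplicate.
example_spots = {
    "probe_A": [100.0, 110.0, 90.0],
    "probe_B": [100.0, 300.0, 20.0],
}
inconsistent = flag_inconsistent(example_spots)
```

Probes printed six times (those with identical human and mouse sequences) can be passed to the same function, since it accepts any number of replicates of two or more.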

Materials and Methods

The following materials and methods were employed in designing and testing the microchip.

miRNA Oligonucleotide Probe Design. A total of 281 miRNA precursor sequences (190 Homo sapiens, 88 Mus musculus, and 3 Arabidopsis thaliana) with annotated active sites were selected for oligonucleotide design. These correspond to human and mouse miRNAs found in the miRNA Registry or collected from published literature. All of the sequences were confirmed by BLAST alignment with the corresponding genome and the hairpin structures were analyzed. When two precursors with different length or slightly different base composition for the same miRNAs were found, both sequences were included in the database and the one that satisfied the highest number of design criteria was used. The sequences were clustered by organism using the LEADS platform (Sorek, R., Ast, G. & Graur, D. (2002) Genome Research 12, 1060-1067), resulting in 248 clusters (84 mouse, 161 human, and 3 Arabidopsis). For each cluster, all 40-mer oligonucleotides were evaluated for their cross-homology to all genes of the relevant organism, number of bases in alignment to a repetitive element, amount of low-complexity sequence, maximum homopolymeric stretch, global and local G+C content, and potential hairpins (self 5-mers). The best oligonucleotide was selected that contained each active site of each miRNA. This produced a total of 259 oligonucleotides; there were 11 clusters with multiple annotated active sites. Next, we attempted to design an oligonucleotide that did not contain the active site for each cluster, when it was possible to choose such an oligonucleotide that did not overlap the selected oligonucleotide(s) by more than 10 nt. To design each of these additional oligonucleotides, we required <75% global cross-homology and <20 bases in any 100% alignment to the relevant organism, <16 bases in alignments to repetitive elements, <16 bases of low-complexity, homopolymeric stretches of no more than 6 bases, G+C content between 30-70% and no more than 11 windows of size 10 with G+C content outside 30-70%, and no self 5-mers. A total of 76 additional oligonucleotides were designed. In addition, we designed oligonucleotides for 7 mouse tRNAs and 8 human tRNAs, using similar design criteria. We selected a single oligonucleotide for each, with the exception of the human and mouse initiators, Met-tRNA-i, for which we selected two oligonucleotides each (Table 8).
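
The sequence-intrinsic criteria listed above (global G+C content, G+C content of 10-base windows, homopolymeric stretches and self 5-mers) can be sketched in Python as a simple filter on candidate 40-mers. This is an illustration under stated assumptions, not the design software actually used: the function names are hypothetical; the cross-homology, repetitive-element and low-complexity checks need genome alignments and are omitted; and a "self 5-mer" is assumed here to mean a 5-mer whose reverse complement occurs elsewhere in the same oligonucleotide without overlapping it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    longest = run = 1
    for previous, current in zip(seq, seq[1:]):
        run = run + 1 if previous == current else 1
        longest = max(longest, run)
    return longest


def windows_outside_gc(seq, size=10, low=0.30, high=0.70):
    """Count sliding windows whose G+C content falls outside [low, high]."""
    return sum(
        1
        for i in range(len(seq) - size + 1)
        if not low <= gc_fraction(seq[i:i + size]) <= high
    )


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def has_self_kmer(seq, k=5):
    """True if some k-mer and its reverse complement occur without overlap.

    Assumed interpretation of the 'self 5-mers' criterion.
    """
    starts = {}
    for i in range(len(seq) - k + 1):
        starts.setdefault(seq[i:i + k], []).append(i)
    for kmer, positions in starts.items():
        for j in starts.get(reverse_complement(kmer), []):
            if any(abs(i - j) >= k for i in positions):
                return True
    return False


def passes_intrinsic_filters(seq, length=40):
    """Apply the thresholds quoted in the text to one candidate oligo."""
    seq = seq.upper()
    return (
        len(seq) == length
        and 0.30 <= gc_fraction(seq) <= 0.70
        and max_homopolymer(seq) <= 6
        and windows_outside_gc(seq) <= 11
        and not has_self_kmer(seq)
    )
```

In a fuller pipeline, candidates passing this filter would still need to be screened against the genome for cross-homology and repeats before the best oligonucleotide per cluster is chosen.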

TABLE 8 Oligonucleotides used for the miRNA microarray chip andcorrespondence with specific human and mouse microRNAs.Oligonucleotide_name Corresponding miRNA Oligonucleotide sequence Coversactive site? Notes SEQ ID NO. ath-miR156a-#1 ath-miR156aTGACAGAAGAGAGTGAGCACACAAAGGCAATTTGCATATC yes 286 ath-miR156a-#2ath-miR156a CATTGCACTTGCTTCTCTTGCGTGCTCACTGCTCTTTCTG no 287ath-miR157a-#1 ath-miR157a GTGTTGACAGAAGATAGAGAGCACAGATGATGAGATACAA yes288 ath-miR157a-#2 ath-miR157a CATCTTACTCCTTTGTGCTCTCTAGCCTTCTGTCATCACCno 289 ath-miR180a-#1 ath-miR180GATGGACGGTGGTGATTCACTCTCCACAAAGTTCTCTATG no 290 ath-miR180a-#2ath-miR180 TGAGAATCTTGATGATGCTGCATCGGCAATCAACGACTAT yes 291hsa-let-7a-1-prec let-7a-1 TGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCA yes292 hsa-let-7a-2-prec-#1 let-7a-2TACAGCCTCCTAGCTTTCCTTGGGTCTTGCACTAAACAAC no 293 hsa-let-7a-2-prec-#2let-7a-2 ACTGCATGCTCCCAGGTTGAGGTAGTAGGTTGTATAGTTT yes 294hsa-let-7a-3-prec let-7a-3 GGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTG yes295 hsa-let-7b-prec let-7b TGAGGTAGTAGGTTGTGTGGTTTCAGGGCAGTGATGTTGC yes296 hsa-let-7c-prec let-7c GCATCCGGGTTGAGGTAGTAGGTTGTATGGTTTAGAGTTA yes297 hsa-let-7d-prec let-7d CCTAGGAAGAGGTAGTAGGTTGCATAGTTTTAGGGCAGGG yes298 hsa-let-7d-v1-prec let-7d (=7d-v1)CTAGGAAGAGGTAGTAGTTTGCATAGTTTTAGGGCAAAGA yes 299 hsa-let-7d-v2-prec-#1let-7i (=let-7d-v2) TTGGTCGGGTTGTGACATTGCCCGCTGTGGAGATAACTGC no 300hsa-let-7d-v2-prec-#2 let-7i (=let-7d-v2)GCTGAGGTAGTAGTTTGTGCTGTTGGTCGGGTTGTGACAT yes idem mmu-let- 301 7i-prechsa-let-7e-prec let-7e GGCTGAGGTAGGAGGTTGTATAGTTGAGGAGGACACCCAA yes 302hsa-let-7f-1-prec-#1 let-7f-1 GGTAGTGATTTTACCCTGTTCAGGAGATAACTATACAATCno 303 hsa-let-7f-1-prec-#2 let-7f-1GGGATGAGGTAGTAGATTGTATAGTTGTGGGGTAGTGATT yes 304 hsa-let-7f-2-prec2let-7f-2 TGAGGTAGTAGATTGTATAGTTTTAGGGTCATACCCCATC yes 305hsa-let-7g-prec-#1 let-7g CTGATTCCAGGCTGAGGTAGTAGTTTGTACAGTTTGAGGG yes306 hsa-let-7g-prec-#2 let-7g TTGAGGGTCTATGATACCACCCGGTACAGGAGATAACTGTno 307 hsa-miR-001b-1-prec1 miR-001AATGCTATGGAATGTAAAGAAGTATGTATTTTTGGTAGGC yes 308 
hsa-miR-001b-2-precmiR-001 TAAGCTATGGAATGTAAAGAAGTATGTATCTCAGGCCGGG yes 309hsa-miR-007-1-prec miR-007-1 TGTTGGCCTAGTTCTGTGTGGAAGACTAGTGATTTTGTTGyes 310 hsa-miR-007-2-prec-#1 miR-007-2TACTGCGCTCAACAACAAATCCCAGTCTACCTAATGGTGC no 311 hsa-miR-007-2-prec-#2miR-007-2 GGACCGGCTGGCCCCATCTGGAAGACTAGTGATTTTGTTG yes 312hsa-miR-007-3-prec-#1 miR-007-3 AGATTAGAGTGGCTGTGGTCTAGTGCTGTGTGGAAGACTAno 313 hsa-miR-007-3-prec-#2 miR-007-3TGGAAGACTAGTGATTTTGTTGTTCTGATGTACTACGACA yes 314 hsa-miR-009-1-#1miR-009-1 (miR-131-1) TCTTTGGTTATCTAGCTGTATGAGTGGTGTGGAGTCTTCA yes 315hsa-miR-009-1-#2 miR-009-1 (miR-131-1)TAAAGCTAGATAACCGAAAGTAAAAATAACCCCATACACT yes 316 hsa-miR-009-2-#1miR-009-2 (miR-131-2) GAAGCGAGTTGTTATCTTTGGTTATCTAGCTGTATGAGTG yes 317hsa-miR-009-2-#2 miR-009-2 (miR-131-2)GAGTGTATTGGTCTTCATAAAGCTAGATAACCGAAAGTAA yes idem mmu-miR- 318009-prec-#2 hsa-miR-009-3-#1 miR-009-3 (miR-131-3)GGGAGGCCCGTTTCTCTCTTTGGTTATCTAGCTGTATGAG yes 319 hsa-miR-009-3-#2miR-009-3 (miR-131-3) GTGCCACAGAGCCGTCATAAAGCTAGATAACCGAAAGTAG yes 320hsa-miR-010a-prec-#1 miR-010a GTCTGTCTTCTGTATATACCCTGTAGATCCGAATTTGTGTyes 321 hsa-miR-010a-prec-#2 miR-010aGTGGTCACAAATTCGTATCTAGGGGAATATGTAGTTGACA no 322 hsa-miR-010b-prec-#1miR-010b TACCCTGTAGAACCGAATTTGTGTGGTATCCGTATAGTCA yes 323hsa-miR-010b-prec-#2 miR-010b GTCACAGATTCGATTCTAGGGGAATATATGGTCGATGCAAno 324 hsa-miR-015a-2-prec-#1 miR-15-aCCTTGGAGTAAAGTAGCAGCACATAATGGTTTGTGGATTT yes 325 hsa-miR-015a-2-prec-#2miR-15-a TTTGTGGATTTTGAAAAGGTGCAGGCCATATTGTGCTGCC no 326hsa-miR-015b-prec-#1 miR-015-b GGCCTTAAAGTACTGTAGCAGCACATCATGGTTTACATGCyes 327 hsa-miR-015b-prec-#2 miR-015-bTGCTACAGTCAAGATGCGAATCATTATTTGCTGCTCTAGA no 328 hsa-miR-016a-chr13miR-016-1 CAATGTCAGCAGTGCCTTAGCAGCACGTAAATATTGGCGT yes 329hsa-miR-016b-chr3 miR-016-2 GTTCCACTCTAGCAGCACGTAAATATTGGCGTAGTGAAAT yes330 hsa-miR-017-prec-#1 miR-017 (miR-091)GCATCTACTGCAGTGAAGGCACTTGTAGCATTATGGTGAC yes 331 hsa-miR-017-prec-#2miR-017 (miR-091) GTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTGAT yes 332hsa-miR-018-prec miR-018 
TAAGGTGCATCTAGTGCAGATAGTGAAGTAGATTAGCATC yes333 hsa-miR-019a-prec miR-019a TGTAGTTGTGCAAATCTATGCAAAACTGATGGTGGCCTGCyes 334 hsa-miR-019b-1-prec miR-019b-1TTCTGCTGTGCAAATCCATGCAAAACTGACTGTGGTAGTG yes 335 hsa-miR-019b-2-precmiR-019b-2 GTGGCTGTGCAAATCCATGCAAAACTGATTGTGATAATGT yes 336hsa-miR-020-prec miR-020 TAAAGTGCTTATAGTGCAGGTAGTGTTTAGTTATCTACTG yes337 hsa-miR-021-prec-17-#1 miR-021GTCGGGTAGCTTATCAGACTGATGTTGACTGTTGAATCTC yes 338 hsa-miR-021-prec-17-#2miR-021 TTCAACAGTCAACATCAGTCTGATAAGCTACCCGACAAGG yes 339hsa-miR-022-prec miR-022 TGTCCTGACCCAGCTAAAGCTGCCAGTTGAAGAACTGTTG yes340 hsa-miR-023a-prec miR-023a TCCTGTCACAAATCACATTGCCAGGGATTTCCAACCGACCyes 341 hsa-miR-023b-prec miR-023bAATCACATTGCCAGGGATTACCACGCAACCACGACCTTGG yes 342 hsa-miR-024-1-prec-#1miR-024-1 TTTTACACACTGGCTCAGTTCAGCAGGAACAGGAGTCGAG yes 343hsa-miR-024-1-prec-#2 miR-024-1 TCCGGTGCCTACTGAGCTGATATCAGTTCTCATTTTACACyes 344 hsa-miR-024-2-prec miR-024-2AGTTGGTTTGTGTACACTGGCTCAGTTCAGCAGGAACAGG yes 345 hsa-miR-025-precmiR-025 ACGCTGCCCTGGGCATTGCACTTGTCTCGGTCTGACAGTG yes 346hsa-miR-026a-prec-#1 miR-026a TTCAAGTAATCCAGGATAGGCTGTGCAGGTCCCAATGGCCyes 347 hsa-miR-026a-prec-#2 miR-026aTCCCAATGGCCTATCTTGGTTACTTGCACGGGGACGCGGG no 348 hsa-miR-026b-precmiR-026b TTCAAGTAATTCAGGATAGGTTGTGTGCTGTCCAGCCTGT yes 349hsa-miR-027a-prec miR-027a GTCCACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCC yes350 hsa-miR-027b-prec miR-027b CCGCTTTGTTCACAGTGGCTAAGTTCTGCACCTGAAGAGAyes 351 hsa-miR-028-prec miR-028AAGGAGCTCACAGTCTATTGAGTTACCTTTCTGACTTTCC yes 352 hsa-miR-029a-2-#1miR-029a CTAGCACCATCTGAAATCGGTTATAATGATTGGGGAAGAG yes 353hsa-miR-029a-2-#2 miR-029a CCCCTTAGAGGATGACTGATTTCTTTTGGTGTTCAGAGTC no354 hsa-miR-029b-2 = miR-029b (=miR-102-AGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTTGGGG yes 355 102prec7.1=7.2 7.1= 7.2) hsa-miR-029c-prec miR-029cTTTTGTCTAGCACCATTTGAAATCGGTTATGATGTAGGGG yes 356 hsa-miR-030a-prec-#1miR-030a-as GCGACTGTAAACATCCTCGACTGGAAGCTGTGAAGCCACA yes 357hsa-miR-030a-prec-#2 miR-030a-s CACAGATGGGCTTTCAGTCGGATGTTTGCAGCTGCCTACTyes 358 
hsa-miR-030b-prec-#1 miR-030bTGTAAACATCCTACACTCAGCTGTAATACATGGATTGGCT yes 359 hsa-miR-030b-prec-#2miR-030b ATGGATTGGCTGGGAGGTGGATGTTTACTTCAGCTGACTT no 360hsa-miR-030c-prec miR-030c TACTGTAAACATCCTACACTCTCAGCTGTGGAAAGTAAGA yes361 hsa-miR-030d-prec-#1 miR-030dTAAGACACAGCTAAGCTTTCAGTCAGATGTTTGCTGCTAC no 362 hsa-miR-030d-prec-#2miR-030d TTGTAAACATCCCCGACTGGAAGCTGTAAGACACAGCTAA yes 363hsa-miR-031-prec miR-031 GGCAAGATGCTGGCATAGCTGTTGAACTGGGAACCTGCTA yes364 hsa-miR-032-prec-#1 miR-032 TGTCACGGCCTCAATGCAATTTAGTGTGTGTGATATTTTCno 365 hsa-miR-032-prec-#2 miR-032GGAGATATTGCACATTACTAAGTTGCATGTTGTCACGGCC yes 366 hsa-miR-033b-precmiR-033b GTGCATTGCTGTTGCATTGCACGTGTGTGAGGCGGGTGCA yes 367hsa-miR-033-prec miR-33 GTGGTGCATTGTAGTTGCATTGCATGTTCTGGTGGTACCC yes 368hsa-miR-034-prec-#1 miR-034 (=miR-170)GAGTGTTTCTTTGGCAGTGTCTTAGCTGGTTGTTGTGAGC yes 369 hsa-miR-034-prec-#2miR-034 (=miR-170) AGTAAGGAAGCAATCAGCAAGTATACTGCCCTAGAAGTGC no 370hsa-miR-092-prec- miR-092-1 ACAGGTTGGGATCGGTTGCAATGCTGTGTTTCTGTATGGT no371 13=092-1-#1 hsa-miR-092-prec- miR-092-1TCTGTATGGTATTGCACTTGTCCCGGCCTGTTGAGTTTGG yes 372 13=092-1-#2hsa-miR-092-prec- miR-092-2 GTTCTATATAAAGTATTGCACTTGTCCCGGCCTGTGGAAG yes373 X=092-2 hsa-miR-093-prec- miR-093-1CCAAAGTGCTGTTCGTGCAGGTAGTGTGATTACCCAACCT yes 374 7.1=093-1hsa-miR-095-prec-4 miR-095 CGTTACATTCAACGGGTATTTATTGAGCACCCACTCTGTG yes375 hsa-miR-096-prec-7-#1 miR-096CTCCGCTCTGAGCAATCATGTGCAGTGCCAATATGGGAAA no 376 hsa-miR-096-prec-7-#2miR-096 TGGCCGATTTTGGCACTAGCACATTTTTGCTTGTGTCTCT yes 377hsa-miR-098-prec-X miR-098 TGAGGTAGTAAGTTGTATTGTTGTGGGGTAGGGATATTAG yes378 hsa-miR-099b-prec-19- miR-099bGCCTTCGCCGCACACAAGCTCGTGTCTGTGGGTCCGTGTC no idem mmu-miR- 379 #1099b-prec-#1 hsa-miR-099b-prec-19- miR-099bCACCCGTAGAACCGACCTTGCGGGGCCTTCGCCGCACACA yes idem mmu-miR- 380 #2099b-prec-#2 hsa-miR-099-prec-21 miR-099a (=miR-099-ATAAACCCGTAGATCCGATCTTGTGGTGAAGTGGACCGCA yes 381 prec21)hsa-miR-100-1/2-prec miR-100 TGAGGCCTGTTGCCACAAACCCGTAGATCCGAACTTGTGGyes 382 hsa-miR-101-1/2-prec-#1 
miR-101-1CCCTGGCTCAGTTATCACAGTGCTGATGCTGTCTATTCTA no 383 hsa-miR-101-1/2-prec-#2miR-101-1 TACAGTACTGTGATAACTGAAGGATGGCAGCCATCTTACC yes 384hsa-miR-101-prec-9 miR-101-2 GCTGTATATCTGAAAGGTACAGTACTGTGATAACTGAAGAyes 385 hsa-miR-102-prec-1 miR-102TCTTTGTATCTAGCACCATTTGAAATCAGTGTTTTAGGAG yes 386 hsa-miR-103-2-precmiR-103-2 GTAGCATTCAGGTCAAGCAACATTGTACAGGGCTATGAAA yes 387hsa-miR-103-prec- miR-103-1 (=miR-103-TATGGATCAAGCAGCATTGTACAGGGCTATGAAGGCATTG yes 388 5=103-1 5)hsa-miR-105-prec- miR-105-1 (=miR-105-ATCGTGGTCAAATGCTCAGACTCCTGTGGTGGCTGCTCAT yes 389 X.1=105-1 prec-X)hsa-miR-106-prec-X miR-106a CCTTGGCCATGTAAAAGTGCTTACAGTGCAGGTAGCTTTT yes390 hsa-miR-107-prec-10 miR-107 GGCATGGAGTTCAAGCAGCATTGTACAGGGCTATCAAAGCyes 391 hsa-miR-122a-prec miR-122aCCTTAGCAGAGCTGTGGAGTGTGACAATGGTGTTTGTGTC yes 392 hsa-miR-123-prec-#1miR-123 = miR-126 GACGGGACATTATTACTTTTGGTACGCGCTGTGACACTTC yes 393hsa-miR-123-prec-#2 miR-123 = miR-126TGTGACACTTCAAACTCGTACCGTGAGTAATAATGCGCCG yes 394 hsa-miR-124a-1-prec1miR-124a-1 ATACAATTAAGGCACGCGGTGAATGCCAAGAATGGGGCTG yes 395hsa-miR-124a-2-prec miR-124a-2 TTAAGGCACGCGGTGAATGCCAAGAGCGGAGCCTACGGCTyes 396 hsa-miR-124a-3-prec miR-124a-3TTAAGGCACGCGGTGAATGCCAAGAGAGGCGCCTCCGCCG yes 397 hsa-miR-125a-prec-#1miR-125a TCTAGGTCCCTGAGACCCTTTAACCTGTGAGGACATCCAG yes 398hsa-miR-125a-prec-#2 miR-125a CAGGGTCACAGGTGAGGTTCTTGGGAGCCTGGCGTCTGGCno 399 hsa-miR-125b-1 miR-125b-1TCCCTGAGACCCTAACTTGTGATGTTTACCGTTTAAATCC yes 400 hsa-miR-125b-2-prec-#1miR-125b-2 TAGTAACATCACAAGTCAGGCTCTTGGGACCTAGGCGGAG no 401hsa-miR-125b-2-prec-#2 miR-125b-2ACCAGACTTTTCCTAGTCCCTGAGACCCTAACTTGTGAGG yes 402 hsa-miR-127-precmiR-127 TCGGATCCGTCTGAGCTTGGCTGGTCGGAAGTCTCATCAT yes 403hsa-miR-128a-prec-#1 miR-128a TTGGATTCGGGGCCGTAGCACTGTCTGAGAGGTTTACATTno idem mmu-miR- 404 128-prec-#2 hsa-miR-128a-prec-#2 miR-128aACATTTCTCACAGTGAACCGGTCTCTTTTTCAGCTGCTTC yes 405 hsa-miR-128b-prec-#1miR-128b TCACAGTGAACCGGTCTCTTTCCCTACTGTGTCACACTCC yes 406hsa-miR-128b-prec-#2 miR-128b GGGGGCCGATACACTGTACGAGAGTGAGTAGCAGGTCTCAno 407 
hsa-miR-129-prec-#1 miR-129-1/2TGGATCTTTTTGCGGTCTGGGCTTGCTGTTCCTCTCAACA yes 408 hsa-miR-129-prec-#2miR-129-1/2 CCTCTCAACAGTAGTCAGGAAGCCCTTACCCCAAAAAGTA no 409hsa-miR-130a-prec-#1 miR-130a CCAGAGCTCTTTTCACATTGTGCTACTGTCTGCACCTGTCno 410 hsa-miR-130a-prec-#2 miR-130aTGTCTGCACCTGTCACTAGCAGTGCAATGTTAAAAGGGCA yes 411 hsa-miR-132-prec-#1miR-132 TGTGGGAACTGGAGGTAACAGTCTACAGCCATGGTCGCCC yes 412hsa-miR-132-prec-#2 miR-132 TCCAGGGCAACCGTGGCTTTCGATTGTTACTGTGGGAACT no413 hsa-miR-133a-1 miR-133a-1 (=miR-CCTCTTCAATGGATTTGGTCCCCTTCAACCAGCTGTAGCT yes 414 133c) hsa-miR-133a-2miR-133a-2 (=miR- TTGGTCCCCTTCAACCAGCTGTAGCTGTGCATTGATGGCG yes 415 133d)hsa-miR-134-prec-#1 miR-134 ATGCACTGTGTTCACCCTGTGGGCCACCTAGTCACCAACC no416 hsa-miR-134-prec-#2 miR-134 GTGTGTGACTGGTTGACCAGAGGGGCATGCACTGTGTTCAyes 417 hsa-miR-135-1-prec miR-135-1 (=miR-135)GCCTCGCTGTTCTCTATGGCTTTTTATTCCTATGTGATTC yes 418 hsa-miR-135-2-precmiR-135-2 CACTCTAGTGCTTTATGGCTTTTTATTCCTATGTGATAGT yes 419hsa-miR-136-prec-#1 miR-136 ATGCTCCATCATCGTCTCAAATGAGTCTTCAGAGGGTTCT no420 hsa-miR-136-prec-#2 miR-136 TGAGCCCTCGGAGGACTCCATTTGTTTTGATGATGGATTCyes 421 hsa-miR-137-prec miR-137GGATTACGTTGTTATTGCTTAAGAATACGCGTAGTCGAGG yes idem mmu-miR- 422 137-prechsa-miR-138-1-prec miR-138-1 AGCTGGTGTTGTGAATCAGGCCGTTGCCAATCAGAGAACGyes 423 hsa-miR-138-2-prec miR-138-2AGCTGGTGTTGTGAATCAGGCCGACGAGCAGCGCATCCTC yes idem mmu-miR- 424 138-prechsa-miR-139-prec miR-139 GTGTATTCTACAGTGCACGTGTCTCCAGTGTGGCTCGGAG yes425 hsa-miR-140-#1 miR-140-as GCCAGTGGTTTTACCCTATGGTAGGTTACGTCATGCTGTTno 426 hsa-miR-140-#2 miR-140-asTTCTACCACAGGGTAGAACCACGGACAGGATACCGGGGCA yes 427 hsa-miR-141-prec-#1miR-141 TTGTGAAGCTCCTAACACTGTCTGGTAAAGATGGCTCCCG yes 428hsa-miR-141-prec-#2 miR-141 ATCTTCCAGTACAGTGTTGGATGGTCTAATTGTGAAGCTC no429 hsa-miR-142-prec miR-142-as CCCATAAAGTAGAAAGCACTACTAACAGCACTGGAGGGTGyes idem mmu-miR- 430 142-prec hsa-miR-143-prec miR-143CTGGTCAGTTGGGAGTCTGAGATGAAGCACTGTAGCTCAG yes 431 hsa-miR-144-prec-#1miR-144 CGATGAGACACTACAGTATAGATGATGTACTAGTCCGGGC yes 
432hsa-miR-144-prec-#2 miR-144 CCCTGGCTGGGATATCATCATATACTGTAAGTTTGCGATG no433 hsa-miR-145-prec miR-145 CCTCACGGTCCAGTTTTCCCAGGAATCCCTTAGATGCTAAyes 434 hsa-miR-146-prec miR-146TGAGAACTGAATTCCATGGGTTGTGTCAGTGTCAGACCTC yes 435 hsa-miR-147-precmiR-147 GACTATGGAAGCCAGTGTGTGGAAATGCTTCTGCTAGATT yes 436hsa-miR-148-prec miR-148 TGAGTATGATAGAAGTCAGTGCACTACAGAACTTTGTCTC yes437 hsa-miR-149-prec miR-149 CGAGCTCTGGCTCCGTGTCTTCACTCCCGTGCTTGTCCGAyes 438 hsa-miR-150-prec miR-150CTCCCCATGGCCCTGTCTCCCAACCCTTGTACCAGTGCTG yes 439 hsa-miR-151-precmiR-151 GTATGTCTCATCCCCTACTAGACTGAAGCTCCTTGAGGAC yes 440hsa-miR-152-prec-#1 miR-152 ACTCGGGCTCTGGAGCAGTCAGTGCATGACAGAACTTGGG yesidem mmu-miR- 441 152-prec hsa-miR-152-prec-#2 miR-152CCCCGGCCCAGGTTCTGTGATACACTCCGACTCGGGCTCT no 442 hsa-miR-153-1-prec1miR-153-1 CAGTTGCATAGTCACAAAAGTGATCATTGGCAGGTGTGGC yes 443hsa-miR-153-1-prec2 miR-153-1 CACAGCTGCCAGTGTCATTGTCACAAAAGTGATCATTGGCyes 444 hsa-miR-153-2-prec miR-153-2GCCCAGTTGCATAGTCACAAAAGTGATCATTGGAAACTGT yes 445 hsa-miR-154-prec1-#1miR-154 GTGGTACTTGAAGATAGGTTATCCGTGTTGCCTTCGCTTT yes 446hsa-miR-154-prec1-#2 miR-154 GCCTTCGCTTTATTTGTGACGAATCATACACGGTTGACCT no447 hsa-miR-155-prec miR-155(BIC)TTAATGCTAATCGTGATAGGGGTTTTTGCCTCCAACTGAC yes 448 hsa-miR-181a-prec-#1miR-181a (=miR-178-2) TCAGAGGACTCCAAGGAACATTCAACGCTGTCGGTGAGTT yes 449hsa-miR-181a-prec-#2 miR-181a (=miR-178-2)GAAAAAACCACTGACCGTTGACTGTACCTTGGGGTCCTTA no 450 hsa-miR-181b-prec-#1miR-181b (=miR-178) TGAGGTTGCTTCAGTGAACATTCAACGCTGTCGGTGAGTT yes 451hsa-miR-181b-prec-#2 miR-181b (=miR-178)ACCATCGACCGTTGATTGTACCCTATGGCTAACCATCATC yes 452 hsa-miR-181c-prec-#1miR-181c TGCCAAGGGTTTGGGGGAACATTCAACCTGTCGGTGAGTT yes 453hsa-miR-181c-prec-#2 miR-181c ATCGACCGTTGAGTGGACCCTGAGGCCTGGAATTGCCATCno 454 hsa-miR-182-prec-#1 miR-182-sAGGTAACAGGATCCGGTGGTTCTAGACTTGCCAACTATGG no 455 hsa-miR-182-prec-#2miR-182-s TTGGCAATGGTAGAACTCACACTGGTGAGGTAACAGGATC yes 456hsa-miR-183-prec-#1 miR-183 (=miR-174)GACTCCTGTTCTGTGTATGGCACTGGTAGAATTCACTGTG yes 457 
hsa-miR-183-prec-#2miR-183 (=miR-174) GTCTCAGTCAGTGAATTACCGAAGGGCCATAAACAGAGCA no 458hsa-miR-184-prec-#1 miR-184 GACTGTAAGTGTTGGACGGAGAACTGATAAGGGTAGGTGA yes459 hsa-miR-184-prec-#2 miR-184 CGTCCCCTTATCACTTTTCCAGCCCAGCTTTGTGACTGTAno 460 hsa-miR-185-prec-#1 miR-185GCGAGGGATTGGAGAGAAAGGCAGTTCCTGATGGTCCCCT yes 461 hsa-miR-185-prec-#2miR-185 CCTCCCCAGGGGCTGGCTTTCCTCTGGTCCTTCCCTCCCA no 462 hsa-miR-186-precmiR-186 CTTGTAACTTTCCAAAGAATTCTCCTTTTGGGCTTTCTGG yes 463hsa-miR-187-prec-#1 miR-187 CTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCA yes464 hsa-miR-187-prec-#2 miR-187 TCACCATGACACAGTGTGAGACTCGGGCTACAACACAGGAno 465 hsa-miR-188-prec miR-188 TCACATCCCTTGCATGGTGGAGGGTGAGCTTTCTGAAAACyes 466 hsa-miR-190-prec miR-190GCAGGCCTCTGTGTGATATGTTTGATATATTAGGTTGTTA yes 467 hsa-miR-191-precmiR-191 CAACGGAATCCCAAAAGCAGCTGTTGTCTCCAGAGCATTC yes idem mmu-miR- 468191-prec hsa-miR-192-2/3-#1 miR-192TCTGACCTATGAATTGACAGCCAGTGCTCTCGTCTCCCCT yes 469 hsa-miR-192-2/3-#2miR-192 CCAATTCCATAGGTCACAGGTATGTTCGCCTCAATGCCAG no 470hsa-miR-193-prec-#1 miR-193 AGATGAGGGTGTCGGATCAACTGGCCTACAAAGTCCCAGT yes471 hsa-miR-193-prec-#2 miR-193 AGGATGGGAGCTGAGGGCTGGGTCTTTGCGGGCGAGATGAno 472 hsa-miR-194-prec-#1 miR-194TGTAACAGCAACTCCATGTGGACTGTGTACCAATTTCCAG yes 473 hsa-miR-194-prec-#2miR-194 CCAATTTCCAGTGGAGATGCTGTTACTTTTGATGGTTACC no 474 hsa-miR-195-precmiR-195 TCTAGCAGCACAGAAATATTGGCACAGGGAAGCGAGTCTG yes 475hsa-miR-196-1-prec-#1 miR-196-1 CTGCTGAGTGAATTAGGTAGTTTCATGTTGTTGGGCCTGGyes 476 hsa-miR-196-1-prec-#2 miR-196-1ACACAACAACATTAAACCACCCGATTCACGGCAGTTACTG no 477 hsa-miR-196-2-prec-#1miR-196-2 AGAAACTGCCTGAGTTACATCAGTCGGTTTTCGTCGAGGG no 478hsa-miR-196-2-prec-#2 miR-196-2 GCTGATCTGTGGCTTAGGTAGTTTCATGTTGTTGGGATTGyes 479 hsa-miR-197-prec miR-197TAAGAGCTCTTCACCCTTCACCACCTTCTCCACCCAGCAT yes 480 hsa-miR-198-precmiR-198 TCATTGGTCCAGAGGGGAGATAGGTTCCTGTGATTTTTCC yes 481hsa-miR-199a-1-prec miR-199a-1(=199s)GCCAACCCAGTGTTCAGACTACCTGTTCAGGAGGCTCTCA yes 482 hsa-miR-199a-2-precmiR-199a-2 TCGCCCCAGTGTTCAGACTACCTGTTCAGGACAATGCCGT yes 
483hsa-miR-199b-prec-#1 miR-199b GTCTGCACATTGGTTAGGCTGGGCTGGGTTAGACCCTCGGno 484 hsa-miR-199b-prec-#2 miR-199bACCTCCACTCCGTCTACCCAGTGTTTAGACTATCTGTTCA yes 485 hsa-miR-200a-precmiR-200a GTCTCTAATACTGCCTGGTAATGATGACGGCGGAGCCCTG yes 486hsa-miR-202-prec miR-202 GATCTGGCCTAAAGAGGTATAGGGCATGGGAAGATGGAGC yes487 hsa-miR-203-prec-#1 miR-203 GTTCTGTAGCGCAATTGTGAAATGTTTAGGACCACTAGACyes 488 hsa-miR-203-prec-#2 miR-203TGGGTCCAGTGGTTCTTAACAGTTCAACAGTTCTGTAGCG no 489 hsa-miR-204-prec-#1miR-204 CGTGGACTTCCCTTTGTCATCCTATGCCTGAGAATATATG yes 490hsa-miR-204-prec-#2 miR-204 AGGCTGGGAAGGCAAAGGGACGTTCAATTGTCATCACTGG no491 hsa-miR-205-prec miR-205 TCCTTCATTCCACCGGAGTCTGTCTCATACCCAACCAGATyes 492 hsa-miR-206-prec-#1 miR-206TTGCTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG yes 493 hsa-miR-206-prec-#2miR-206 TGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGG no 494 hsa-miR-208-precmiR-208 ACCTGATGCTCACGTATAAGACGAGCAAAAAGCTTGTTGG yes 495hsa-miR-210-prec miR-210 AGACCCACTGTGCGTGTGACAGCGGCTGATCTGTGCCTGG yes496 hsa-miR-211-prec-#1 miR-211 TTCCCTTTGTCATCCTTCGCCTAGGGCTCTGAGCAGGGCAyes 497 hsa-miR-211-prec-#2 miR-211GCAGGGACAGCAAAGGGGTGCTCAGTTGTCACTTCCCACA no 498 hsa-miR-212-prec-#1miR-212 CCTCAGTAACAGTCTCCAGTCACGGCCACCGACGCCTGGC yes 499hsa-miR-212-prec-#2 miR-212 CGGACAGCGCGCCGGCACCTTGGCTCTAGACTGCTTACTG no500 hsa-miR-213-prec-#1 miR-213 AACATTCATTGCTGTCGGTGGGTTGAACTGTGTGGACAAGyes idem mmu-miR- 501 213-prec hsa-miR-213-prec-#2 miR-213TGTGGACAAGCTCACTGAACAATGAATGCAACTGTGGCCC no 502 hsa-miR-214-prec miR-214TGTACAGCAGGCACAGACAGGCAGTCACATGACAACCCAG yes idem mmu-miR- 503 214-prechsa-miR-215-prec-#1 miR-215 CAGGAAAATGACCTATGAATTGACAGACAATATAGCTGAG yes504 hsa-miR-215-prec-#2 miR-215 CATTTCTTTAGGCCAATATTCTGTATGACTGTGCTACTTCno 505 hsa-miR-216-prec-#1 miR-216CTGGGATTATGCTAAACAGAGCAATTTCCTAGCCCTCACG no 506 hsa-miR-216-prec-#2miR-216 GATGGCTGTGAGTTGGCTTAATCTCAGCTGGCAACTGTGA yes 507hsa-miR-217-prec-#1 miR-217 GAATCAGTCACCATCAGTTCCTAATGCATTGCCTTCAGCA no508 hsa-miR-217-prec-#2 miR-217 TGTCGCAGATACTGCATCAGGAACTGATTGGATAAGAATCyes 509 
hsa-miR-218-1-prec miR-218-1GTTGTGCTTGATCTAACCATGTGGTTGCGAGGTATGAGTA yes 510 hsa-miR-218-2-prec-#1miR-218-2 TGGTGGAACGATGGAAACGGAACATGGTTCTGTCAAGCAC no 511hsa-miR-218-2-prec-#2 miR-218-2 TCGCTGCGGGGCTTTCCTTTGTGCTTGATCTAACCATGTGyes 512 hsa-miR-219-prec miR-219ATTGTCCAAACGCAATTCTCGAGTCTATGGCTCCGGCCGA yes 513 hsa-miR-220-precmiR-220 TGTGGCATTGTAGGGCTCCACACCGTATCTGACACTTTGG yes 514hsa-miR-221-prec miR-221 CAACAGCTACATTGTCTGCTGGGTTTCAGGCTACCTGGAA yesidem mmu-miR- 515 221-prec-#1 hsa-miR-222-prec-#1 miR-222CTTTCGTAATCAGCAGCTACATCTGGCTACTGGGTCTCTG yes 516 hsa-miR-222-prec-#2miR-222 GCTGCTGGAAGGTGTAGGTACCCTCAATGGCTCAGTAGCC no 517 hsa-miR-223-precmiR-223 GAGTGTCAGTTTGTCAAATACCCCAAGTGCGGCACATGCT yes 518hsa-miR-224-prec miR-224 GGCTTTCAAGTCACTAGTGGTTCCGTTTAGTAGATGATTG yes519 HSHELA01 — GGCCGCAGCAACCTCGGTTCGTATCCGAGTCACGGCACCA — 520 HSTRNL —TCCGGATGGAGCGTGGGTTCGAATCCCACTTCTGACACCA — 521 HUMTRAB —ATGGTAGAGCGCTCGCTTTGCTTGCGAGAGGTAGCGGGAT — 522 HUMTRF —GATCTAAAGGTCCCTGGTTCGATCCCGGGTTTCGGCACCA — 523 HUMTRMI-#1 —AGCAGAGTGGCGCAGCGGAAGCGTGCTGGGCCCATAACCC — idem 524 MUSTRMI-#1HUMTRMI-#2 — AACCCAGAGGTCGATGGATCGAAACCATCCTCTGCTACCA — 525 HUMTRN —CAATCGGTTAGCGCGTTCGGCTGTTAACCGAAAGGTTGGT — 526 HUMTRS —TCTAGCGACAGAGTGGTTCAATTCCACCTTTCGGGCGCCA — 527 HUMTRV1A —ACGCGAAAGGTCCCCGGTTCGAAACCGGGCGGAAACACCA — 528 mmu-let-7g-precmmu-let-7g CTGAGGTAGTAGTTTGTACAGTTTGAGGGTCTATGATACC yes 529mmu-let-7i-prec mmu-let-7i GCTGAGGTAGTAGTTTGTGCTGTTGGTCGGGTTGTGACAT yesidem hsa-let-7d- 530 v2-prec-#2 mmu-miR-001b-prec mmu-miR-001bATTCAGTGCTATGGAATGTAAAGAAGTATGTATTTTGGGT yes 531 mmu-miR-001d-precmmu-miR-001d CTGCTAAGCTATGGAATGTAAAGAAGTATGTATTTCAGGC yes 532mmu-miR-009-prec-#1 mmu-miR-009 ATCTTTGGTTATCTAGCTGTATGAGTGTATTGGTCTTCATyes 533 mmu-miR-009-prec-#2 mmu-miR-009-GAGTGTATTGGTCTTCATAAAGCTAGATAACCGAAAGTAA yes idem hsa-miR- 534 009-2-#2mmu-miR-010b-prec mmu-miR-010b TACCCTGTAGAACCGAATTTGTGTGGTACCCACATAGTCAyes 535 mmu-miR-023b-prec mmu-miR-023bTTGAGATTAAAATCACATTGCCAGGGATTACCACGCAACC yes 536 
mmu-miR-027b-precmmu-miR-027b TTGGTTTCCGCTTTGTTCACAGTGGCTAAGTTCTGCACCT yes 537mmu-miR-029b-prec mmu-miR-029b TAAATAGTGATTGTCTAGCACCATTTGAAATCAGTGTTCTyes 538 mmu-miR-030b-prec mmu-miR-030bTGTAAACATCCTACACTCAGCTGTCATACATGCGTTGGCT yes 539 mmu-miR-030e-precmmu-miR-030e TGTAAACATCCTTGACTGGAAGCTGTAAGGTGTTGAGAGG yes 540mmu-miR-099a-prec mmu-miR-099a CATAAACCCGTAGATCCGATCTTGTGGTGAAGTGGACCGCyes 541 mmu-miR-099b-prec-#1 mmu-miR-099bGCCTTCGCCGCACACAAGCTCGTGTCTGTGGGTCCGTGTC no idem hsa-miR- 542099b-prec-19-#1 mmu-miR-099b-prec-#2 mmu-miR-099bCACCCGTAGAACCGACCTTGCGGGGCCTTCGCCGCACACA yes idem hsa-miR- 543099b-prec-19-#2 mmu-miR-100-prec mmu-miR-100TGCCACAAACCCGTAGATCCGAACTTGTGCTGATTCTGCA yes 544 mmu-miR-101-precmmu-miR-101 GCTGTCCATTCTAAAGGTACAGTACTGTGATAACTGAAGG yes 545mmu-miR-122a-prec-#1 mmu-miR-122aGTGTCCAAACCATCAAACGCCATTATCACACTAAATAGCT no 546 mmu-miR-122a-prec-#2mmu-miR-122a GCTGTGGAGTGTGACAATGGTGTTTGTGTCCAAACCATCA yes 547mmu-miR-123-prec-#1 mmu-miR-123 CATTATTACTTTTGGTACGCGCTGTGACACTTCAAACTCGyes 548 mmu-miR-123-prec-#2 mmu-miR-123GACACTTCAAACTCGTACCGTGAGTAATAATGCGCGGTCA yes 549 mmu-miR-124a-precmmu-miR-124a TAATGTCTATACAATTAAGGCACGCGGTGAATGCCAAGAG yes 550mmu-miR-125a-prec mmu-miR-125a TCCCTGAGACCCTTTAACCTGTGAGGACGTCCAGGGTCACyes 551 mmu-miR-125b-prec-#1 mmu-miR-125bGCCTAGTCCCTGAGACCCTAACTTGTGAGGTATTTTAGTA yes 552 mmu-miR-125b-prec-#2mmu-miR-125b ATTTTAGTAACATCACAAGTCAGGTTCTTGGGACCTAGGC no 553mmu-miR-127-prec mmu-miR-127 TTCAGAAAGATCATCGGATCCGTCTGAGCTTGGCTGGTCGyes 554 mmu-miR-128-prec-#1 mmu-miR-128AGGTTTACATTTCTCACAGTGAACCGGTCTCTTTTTCAGC yes 555 mmu-miR-128-prec-#2mmu-miR-128 TTGGATTCGGGGCCGTAGCACTGTCTGAGAGGTTTACATT no idem hsa-miR-556 128a-prec-#1 mmu-miR-129b-prec mmu-miR-129bCTTTTTGCGGTCTGGGCTTGCTGTACATAACTCAATAGCC yes 557 mmu-miR-129-precmmu-miR-129 CTTTTTGCGGTCTGGGCTTGCTGTTTTCTCGACAGTAGTC yes 558mmu-miR-130-prec mmu-miR-130 GTCTAACGTGTACCGAGCAGTGCAATGTTAAAAGGGCATCyes 559 mmu-miR-131-3-prec mmu-miR-131-3AGTGGTGTGGAGTCTTCATAAAGCTAGATAACCGAAAGTA yes 560 
mmu-miR-132-precmmu-miR-132 TGTGGGAACCGGAGGTAACAGTCTACAGCCATGGTCGCCC yes 561mmu-miR-133-prec mmu-miR-133 ATCGCCTCTTCAATGGATTTGGTCCCCTTCAACCAGCTGTyes 562 mmu-miR-134-prec-#1 mmu-miR-134GCACTCTGTTCACCCTGTGGGCCACCTAGTCACCAACCCT no 563 mmu-miR-134-prec-#2mmu-miR-134 TGTGTGACTGGTTGACCAGAGGGGCGTGCACTCTGTTCAC yes 564mmu-miR-135-prec mmu-miR-135 CTATGGCTTTTTATTCCTATGTGATTCTATTGCTCGCTCAyes 565 mmu-miR-136-prec mmu-miR-136GAGGACTCCATTTGTTTTGATGATGGATTCTTAAGCTCCA yes 566 mmu-miR-137-precmmu-miR-137 GGATTACGTTGTTATTGCTTAAGAATACGCGTAGTCGAGG yes idem hsa-miR-567 137-prec mmu-miR-138-prec mmu-miR-138AGCTGGTGTTGTGAATCAGGCCGACGAGCAGCGCATCCTC yes idem hsa-miR- 568138-2-prec mmu-miR-140s-prec mmu-miR-140sTTACGTCATGCTGTTCTACCACAGGGTAGAACCACGGACA yes 569 mmu-miR-141-precmmu-miR-141 GAAGTATGAAGCTCCTAACACTGTCTGGTAAAGATGGCCC yes 570mmu-miR-142-prec mmu-miR-142 CCCATAAAGTAGAAAGCACTACTAACAGCACTGGAGGGTGyes idem hsa-miR- 571 142-prec mmu-miR-143-prec mmu-miR-143TGGTCAGTTGGGAGTCTGAGATGAAGCACTGTAGCTCAGG yes 572 mmu-miR-144-precmmu-miR-144 GTTTGTGATGAGACACTACAGTATAGATGATGTACTAGTC yes 573mmu-miR-145-prec mmu-miR-145 ACGGTCCAGTTTTCCCAGGAATCCCTTGGATGCTAAGATGyes 574 mmu-miR-146-prec mmu-miR-146TGAGAACTGAATTCCATGGGTTATATCAATGTCAGACCTG yes 575 mmu-miR-149-precmmu-miR-149 GCTCTGGCTCCGTGTCTTCACTCCCGTGTTTGTCCGAGGA yes 576mmu-miR-150-prec mmu-miR-150 TGTCTCCCAACCCTTGTACCAGTGCTGTGCCTCAGACCCTyes 577 mmu-miR-151-prec mmu-miR-151TATGTCTCCTCCCTACTAGACTGAGGCTCCTTGAGGGACA yes 578 mmu-miR-152-precmmu-miR-152 ACTCGGGCTCTGGAGCAGTCAGTGCATGACAGAACTTGGG yes idem hsa-miR-579 152-prec-#1 mmu-miR-153-prec mmu-miR-153TAATATGAGCCCAGTTGCATAGTGACAAAAGTGATCATTG yes 580 mmu-miR-154-precmmu-miR-154 AGATAGGTTATCCGTGTTGCCTTCGCTTTATTCGTGACGA yes 581mmu-miR-155-prec mmu-miR-155 TTAATGCTAATTGTGATAGGGGTTTTGGCCTCTGACTGACyes 582 mmu-miR-181-prec mmu-miR-181CCATGGAACATTCAACGCTGTCGGTGAGTTTGGGATTCAA yes 583 mmu-miR-182-precmmu-miR-182 TTTGGCAATGGTAGAACTCACACCGGTAAGGTAATGGGAC yes 584mmu-miR-183-prec-#1 mmu-miR-183 
AACAGTCTCAGTCAGTGAATTACCGAAGGGCCATAAACAGno 585 mmu-miR-183-prec-#2 mmu-miR-183TATGGCACTGGTAGAATTCACTGTGAACAGTCTCAGTCAG yes 586 mmu-miR-184-precmmu-miR-184 TGTGACTCTAAGTGTTGGACGGAGAACTGATAAGGGTAGG yes 587mmu-miR-185-prec mmu-miR-185 GGGATTGGAGAGAAAGGCAGTTCCTGATGGTCCCCTCCCAyes 588 mmu-miR-186-prec mmu-miR-186CAAAGAATTCTCCTTTTGGGCTTTCTCATTTTATTTTAAG yes 589 mmu-miR-187-precmmu-miR-187 GGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGG yes 590mmu-miR-188-prec mmu-miR-188 TCACATCCCTTGCATGGTGGAGGGTGAGCTCTCTGAAAACyes 591 mmu-miR-189-prec mmu-miR-189CGGTGCCTACTGAGCTGATATCAGTTCTCATTTCACACAC yes 592 mmu-miR-190-precmmu-miR-190 CTGTGTGATATGTTTGATATATTAGGTTGTTATTTAATCC yes 593mmu-miR-191-prec mmu-miR-191 CAACGGAATCCCAAAAGCAGCTGTTGTCTCCAGAGCATTCyes idem hsa-miR- 594 191-prec mmu-miR-192-2/3-prec mmu-miR-192-2/3CTGACCTATGAATTGACAGCCAGTGCTCTCGTCTCCCCTC yes 595 mmu-miR-193-precmmu-miR-193 TGAGAGTGTCAGTTCAACTGGCCTACAAAGTCCCAGTCCT yes 596mmu-miR-194-prec mmu-miR-194 ATCGGGTGTAACAGCAACTCCATGTGGACTGTGCTCGGATyes 597 mmu-miR-195-prec mmu-miR-195TAGCAGCACAGAAATATTGGCATGGGGAAGTGAGTCTGCC yes 598 mmu-miR-196-precmmu-miR-196 GTAGGTAGTTTCATGTTGTTGGGCCTGGCTTTCTGAACAC yes 599mmu-miR-199as-prec mmu-miR-199asGAGGCTGGGACATGTACAGTAGTCTGCACATTGGTTAGGC yes 600 mmu-miR-200a-prec-#1mmu-miR-200a TAGTGTCTGATCTCTAATACTGCCTGGTAATGATGACGGC yes 601mmu-miR-200a-prec-#2 mmu-miR-200aCCGTGGCCATCTTACTGGGCAGCATTGGATAGTGTCTGAT no 602 mmu-miR-201-precmmu-miR-201 TACCTTACTCAGTAAGGCATTGTTCTTCTATATTAATAAA yes 603mmu-miR-202-prec mmu-miR-202 GATCTGGTCTAAAGAGGTATAGCGCATGGGAAGATGGAGCyes 604 mmu-miR-203-prec-#1 mmu-miR-203GGTCCAGTGGTTCTTGACAGTTCAACAGTTCTGTAGCACA no 605 mmu-miR-203-prec-#2mmu-miR-203 GTAGCACAATTGTGAAATGTTTAGGACCACTAGACCCGGC yes 606mmu-miR-204-prec mmu-miR-204 TTCCCTTTGTCATCCTATGCCTGAGAATATATGAAGGAGGyes 607 mmu-miR-205-prec mmu-miR-205GTCCTTCATTCCACCGGAGTCTGTCTTATGCCAACCAGAT yes 608 mmu-miR-206-precmmu-miR-206 TAGATATCTCAGCACTATGGAATGTAAGGAAGTGTGTGGT yes 609mmu-miR-207-prec mmu-miR-207 
GCTGCGGCTTGCGCTTCTCCTGGCTCTCCTCCCTCTCCTTyes 610 mmu-miR-212-prec-#1 mmu-miR-212CTTCAGTAACAGTCTCCAGTCACGGCCACCGACGCCTGGC yes 611 mmu-miR-212-prec-#2mmu-miR-212 AGCGCGCCGGCACCTTGGCTCTAGACTGCTTACTGCCCGG no 612mmu-miR-213-prec mmu-miR-213 AACATTCATTGCTGTCGGTGGGTTGAACTGTGTGGACAAGyes idem hsa-miR- 613 213-prec-#1 mmu-miR-214-prec mmu-miR-214TGTACAGCAGGCACAGACAGGCAGTCACATGACAACCCAG yes idem hsa-miR- 614 214-precmmu-miR-215-prec mmu-miR-215 CAGGAGAATGACCTATGATTTGACAGACCGTGCAGCTGTGyes 615 mmu-miR-216-prec-#1 mmu-miR-216GAGATGTCCCTATCATTCCTCACAGTGGTCTCTGGGATTA no 616 mmu-miR-216-prec-#2mmu-miR-216 ATGGCTATGAGTTGGTTTAATCTCAGCTGGCAACTGTGAG yes 617mmu-miR-217-prec-#1 mmu-miR-217 GCAGATACTGCATCAGGAACTGACTGGATAAGACTTAATCyes 618 mmu-miR-217-prec-#2 mmu-miR-217CCCCATCAGTTCCTAATGCATTGCCTTCAGCATCTAAACA no 619 mmu-miR-218-2-prec-#1mmu-miR-218-2 GGGCTTTCCTTTGTGCTTGATCTAACCATGTGGTGGAACG yes 620mmu-miR-218-2-prec-#2 mmu-miR-218-2GTGGTGGAACGATGGAAACGGAACATGGTTCTGTCAAGCA no 621 mmu-miR-219-prec-#1mmu-miR-219 TCCTGATTGTCCAAACGCAATTCTCGAGTCTCTGGCTCCG yes 622mmu-miR-219-prec-#2 mmu-miR-219 CTCTGGCTCCGGCCGAGAGTTGCGTCTGGACGTCCCGAGCno 623 mmu-miR-221-prec-#1 mmu-miR-221CAACAGCTACATTGTCTGCTGGGTTTCAGGCTACCTGGAA yes idem hsa-miR- 624 221-precmmu-miR-221-prec-#2 mmu-miR-221 GGCATACAATGTAGATTTCTGTGTTTGTTAGGCAACAGCTno 625 mmu-miR-222-prec mmu-miR-222TTGGTAATCAGCAGCTACATCTGGCTACTGGGTCTCTGGT yes 626 mmu-miR-223-precmmu-miR-223 AGAGTGTCAGTTTGTCAAATACCCCAAGTGTGGCTCATGC yes 627mmu-miR-224- mmu-miR-224-(miR- TAAGTCACTAGTGGTTCCGTTTAGTAGATGGTCTGTGCATyes 628 precformer175-#1 175) mmu-miR-224- mmu-miR-224-(miR-TGCATTGTTTCAAAATGGTGCCCTAGTGACTACAAAGCCC no 629 precformer175-#2 175)MUSTRF — TAGACTGAAGATCTAAAGGTCCCTGGTTCGATCCCGGGTT — 630 MUSTRM4 —AATCTGAAGGTCGTGAGTTCGATCCTCACACGGGGCACCA — 631 MUSTRMI-#1 —AGCAGAGTGGCGCAGCGGAAGCGTGCTGGGCCCATAACCC — idem 632 HUMTRMI-#1MUSTRMI-#2 — CCCATAACCCAGAGGTCGATGGATCGAAACCATCCTCTGC — 633 MUSTRNAH —TGCGTTGTGGCCGCAGCAACCTCGGTTCGAATCCGAGTCA — 634 MUSTRP2 
—GCTCGTTGGTCTAGGGGTATGATTCTCGCTTTGGGTGCGA — 635 MUSTRS —AGCTGTTTAGCGACAGAGTGGTTCAATTCCACCTTTCGGG — 636 MUSTRV1MN —TTCCGTAGTGTAGTGGTTATCACGCTCGCCTGACACGCGA — 637 Oligonucleotide primer —5′ biotin-AAA-AAA-AAA-AAA-(biotin)AAA-AAA-AAA-AAA- — 638 #1 NNN-NNN-NN3′ Oligonucleotide primer — 5′ biotin-(biotin)-AAA-NNN-NNN-NN 3′ — 639#2 Oligonucleotide primer —5′ GCC-AGT-GAA-TTG-TAA-TAC-GAC-TCA-CTA-TAG-GGA- — 640 #3GGC-GGN-NNN-NNN-N 3′

miRNA Microarray Fabrication. 40-mer 5′ amine modified C6 oligonucleotides were resuspended in 50 mM phosphate buffer pH 8.0 at 20 mM concentration. Each oligonucleotide probe was printed in triplicate on Amersham CodeLink™ activated slides under 45% humidity by a GeneMachine OmniGrid™ 100 Microarrayer in 2×2 pin configuration and 20×20 spot configuration of each subarray. The spot diameter was 100 μm and the distance from center to center was 200 μm. The printed miRNA microarrays were further chemically covalently-coupled under 70% humidity overnight. The miRNA microarrays were ready for sample hybridization after additional blocking and washing steps.

Target Preparation. Five μg of total RNA were separately added to a reaction mix in a final volume of 12 μl, containing 1 μg of [3′(N)8-(A)12-biotin-(A)12-biotin 5′] oligonucleotide primer. The mixture was incubated for 10 min at 70° C. and chilled on ice. With the mixture remaining on ice, 4 μl of 5× first-strand buffer, 2 μl of 0.1 M DTT, 1 μl of 10 mM dNTP mix and 1 μl of Superscript™ II RNaseH⁻ reverse transcriptase (200 U/μl) were added to a final volume of 20 μl, and the mixture was incubated for 90 min in a 37° C. water bath. After incubation for first strand cDNA synthesis, 3.5 μl of 0.5 M NaOH/50 mM EDTA was added to the 20 μl of first strand reaction mix and incubated at 65° C. for 15 min to denature the RNA/DNA hybrids and degrade RNA templates. Then 5 μl of 1 M Tris-HCl, pH 7.6 (Sigma) was added to neutralize the reaction mix, and labeled targets were stored in 28.5 μl at −80° C. until chip hybridization.

Array Hybridization. Labeled targets from 5 μg of total RNA were used for hybridization on each KCC/TJU miRNA microarray containing 368 probes in triplicate, corresponding to 245 human and mouse miRNA genes. All probes on these microarrays were 40-mer oligonucleotides spotted by contacting technologies and covalently attached to a polymeric matrix. The microarrays were hybridized in 6×SSPE/30% formamide at 25° C. for 18 hours, washed in 0.75×TNT at 37° C. for 40 min, and processed using direct detection of the biotin-containing transcripts by Streptavidin-Alexa647 conjugate. Processed slides were scanned using a Perkin Elmer ScanArray® XL5K Scanner with the laser set to 635 nm, at Power 80 and PMT 70 setting, and a scan resolution of 10 microns.

Data Analysis. Images were quantified by QuantArray® Software (PerkinElmer). Signal intensities for each spot were calculated by subtracting local background (based on the median intensity of the area surrounding each spot) from total intensities. Raw data were normalized and analyzed using the GeneSpring® software version 6.1.1 (Silicon Genetics, Redwood City, Calif.). GeneSpring generates an average value of the three spot replicates of each miRNA. Following data transformation (to convert any negative value to 0.01), normalization was performed by using a per-chip 50th percentile method that normalizes each chip on its median, allowing comparison among chips. Hierarchical clustering for both genes and conditions was then generated by using standard correlation as a measure of similarity. To highlight genes that characterize each tissue, a per-gene on median normalization was performed, which normalizes the expression of every miRNA on its median among samples.
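
The transformation and normalization steps described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the GeneSpring implementation: the function names and the example intensities are hypothetical, the handling of a zero median is an added safeguard, and the hierarchical clustering step is not reproduced.

```python
from statistics import mean, median

FLOOR = 0.01  # value substituted for negative averages, as stated in the text


def probe_value(spots):
    """Average background-subtracted replicate spots for one probe.

    spots: list of (total_intensity, local_background) pairs (hypothetical layout).
    """
    signals = [total - background for total, background in spots]
    average = mean(signals)
    return FLOOR if average < 0 else average


def per_chip_normalize(chip):
    """Divide every probe value on one chip by that chip's median (50th percentile)."""
    chip_median = median(chip.values())
    if chip_median == 0:
        raise ValueError("chip median is zero; cannot normalize")
    return {probe: value / chip_median for probe, value in chip.items()}


def per_gene_normalize(chips):
    """Divide each probe by its median across samples (per-gene on median)."""
    probes = list(chips[0])
    gene_median = {p: median(c[p] for c in chips) for p in probes}
    if any(m == 0 for m in gene_median.values()):
        raise ValueError("a per-gene median is zero; cannot normalize")
    return [{p: c[p] / gene_median[p] for p in probes} for c in chips]


# Hypothetical raw data: three replicate spots per probe on one chip.
raw_chip = {
    "x": [(120, 20), (130, 20), (110, 20)],
    "y": [(15, 20), (10, 20), (20, 20)],  # average is negative, so it is floored
    "z": [(70, 20), (70, 20), (70, 20)],
}
chip = {probe: probe_value(spots) for probe, spots in raw_chip.items()}
normalized = per_chip_normalize(chip)
```

Per-gene normalization is applied afterwards to the list of per-chip normalized samples, so that each miRNA is expressed relative to its own median across tissues.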

Samples. HeLa cells were purchased from ATCC and grown as recommended. Mouse macrophage cell line RAW264.7 (established from BALB/c mice) was also used (Dumitru, C. D., Ceci, J. D., Tsatsanis, C., Kontoyiannis, D., Stamatakis, K., Lin, J. H., Patriotis, C., Jenkins, N. A., Copeland, N. G., Kollias, G. & Tsichlis, P. N. (2000) Cell 103, 1071-83). RNA from 20 normal human tissues, including 18 of adult origin (7 hematopoietic: bone marrow, lymphocytes B, T, and CD5+ cells from 2 individuals, peripheral blood leukocytes derived from three healthy donors, spleen, and thymus; and 11 solid tissues, including brain, breast, ovary, testis, prostate, lung, heart, kidney, liver, skeletal muscle, and placenta) and 2 of fetal origin (fetal liver and fetal brain) was assessed for miRNA expression. Each RNA was labeled and hybridized in duplicate and the average expression was calculated. For all the normal tissues, except lymphocytes B, T and CD5+ cells, total RNA was purchased from Ambion (Austin, Tex.).

Cell Preparation. Mononuclear cells (MNC) from peripheral blood of normal donors were separated by Ficoll-Hypaque density gradients. T cells were purified from these MNC by rosetting with neuraminidase treated SRBC and by depletion of contaminant monocytes (CD11b+), natural killer cells (CD16+) and B lymphocytes (CD19+) using magnetic beads (Dynabeads, Unipath, Milano, Italy) and specific monoclonal antibodies (Becton Dickinson, San Jose, Calif.). Total B cells and CD5+ B cells were prepared from tonsils as described (Dono, M., Zupo, S., Leanza, N., Melioli, G., Fogli, M., Melagrana, A., Chiorazzi, N. & Ferrarini, M. (2000) J. Immunol. 164, 5596-604). Briefly, tonsils were obtained from patients in the pediatric age group undergoing routine tonsillectomies, after informed consent. Purified B cells were prepared by rosetting T cells from MNC cells with neuraminidase treated SRBC. In order to obtain CD5+ B cells, purified B cells were incubated with anti CD5 monoclonal antibody followed by goat anti mouse Ig conjugated with magnetic microbeads. CD5+ B cells were positively selected by collecting the cells retained on the magnetic column MS by Mini MACS system (Miltenyi Biotec, Auburn, Calif.). The degree of purification of the cell preparations was higher than 95%, as assessed by flow cytometry.

RNA Extraction and Northern Blots. Total RNA isolation and blots were performed as described (Calin, et al., (2002) Proc Natl Acad Sci USA. 99, 15524-15529). After RNA isolation, the washing step with ethanol was not performed, or if performed, the tube walls were rinsed with 75% ethanol without perturbing the RNA pellet (Lagos-Quintana, et al., (2001) Science 294, 853-858). For reuse, blots were stripped by boiling in 0.1% aqueous SDS/0.1×SSC for 10 min, and were reprobed. 5S rRNA stained with ethidium bromide served as a loading control.

Quantitative RT-PCR for miRNA Precursors. Quantitative RT-PCR was performed as described (Schmittgen, T. D., Jiang, J., Liu, Q. & Yang, L. (2004) Nucleic Acids Research 32, 43-53). Briefly, RNA was reverse transcribed to cDNA with gene-specific primers and Thermoscript, and the relative amount of each miRNA to both U6 RNA and tRNA for initiator methionine was described using the equation 2^(−dC_(T)), where dC_(T) = (C_(T miRNA) − C_(T U6 or HUMTMI RNA)). The miRNAs analyzed included miR-15a, miR-16-1, miR-18, miR-20, miR-21, miR-28-2, miR-30d, miR-93-1, miR-105, miR-124a-2, miR-147, miR-216, miR-219, and miR-224. The primers used were as published (Schmittgen, T. D., Jiang, J., Liu, Q. & Yang, L. (2004) Nucleic Acids Research 32, 43-53).
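
As a worked illustration of the relative quantification equation above, the following Python sketch computes 2 raised to the power of minus dC_T from a pair of threshold cycle values. The function name and the example C_T values are hypothetical and are not taken from the experiments described here.

```python
def relative_expression(ct_mirna, ct_reference):
    """Relative amount of a miRNA precursor versus a reference RNA.

    ct_mirna:     threshold cycle of the miRNA precursor
    ct_reference: threshold cycle of the reference RNA (e.g. U6 RNA)
    """
    d_ct = ct_mirna - ct_reference  # dC_T as defined in the text
    return 2.0 ** (-d_ct)


# Hypothetical values: the precursor crosses threshold 4 cycles after the reference.
example = relative_expression(28.0, 24.0)
```

A precursor that reaches threshold four cycles later than the reference is therefore reported at one sixteenth of the reference level, and equal C_T values give a relative amount of 1.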

Microarray Data Submission. All data were submitted using MIAMExpress to the ArrayExpress database, and each of the 44 samples described here received an ID number ranging from SAMPLE169150SUB621 to SAMPLE169193SUB621.

Results

Hybridization Sensitivity. The hybridization sensitivity of the miRNA microarray was tested using various quantities of total RNA from HeLa cells, starting from 2.5 μg up to 20 μg. The coefficients of correlation between the 5 μg experiment and the 2.5, 10 and 20 μg experiments were 0.98, 0.99 and 0.97, respectively. These results clearly show high inter-assay reproducibility, even in the presence of large differences in RNA quantities. In addition, the standard deviation calculated for miRNA triplicates was below 10% for the vast majority (>95%) of oligonucleotides. All other experiments described here were performed with 5 μg of total RNA.
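
A coefficient of correlation between two hybridizations can be computed from the paired probe signals. The sketch below assumes a Pearson correlation, which the text does not specify, and uses hypothetical signal values rather than the measured data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant signals")
    return covariance / sqrt(var_x * var_y)


# Hypothetical probe signals from two hybridizations with different RNA input.
signals_5ug = [1.0, 2.0, 3.0]
signals_10ug = [2.0, 4.0, 6.0]
r = pearson(signals_5ug, signals_10ug)
```

Because the coefficient is insensitive to a uniform scaling of one data set, signals that rise proportionally with RNA input still correlate strongly.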

Microarray specificity. To test the specificity of the microchip, miRNA expression in human blood leukocytes from three healthy donors and 2 samples of mouse macrophages was analyzed. Samples derived from the same type of tissue presented homogenous patterns of miRNA expression. Furthermore, the pattern of hybridization is different for the two species. To confirm microarray results, the same RNA samples from mouse macrophages and HeLa cells were also analyzed by quantitative RT-PCR for a randomly selected set of 14 miRNAs (Schmittgen, T. D., Jiang, J., Liu, Q. & Yang, L. (2004) Nucleic Acids Research 32, 43-53). When we were able to amplify a miRNA precursor for which a corresponding oligonucleotide was present on the chip (hsa-miR-15a, hsa-miR-30d, mmu-miR-219 and mmu-miR-224), the concordance between the two techniques was 100%. Furthermore, it has been reported that expression levels of the active miRNA and the precursor pre-miRNA are different in the same sample (Calin, et al. (2002) Proc Natl Acad Sci USA. 99, 15524-15529; Mourelatos, et al. (2002) Genes Dev 16, 720-728; Lagos-Quintana, et al. (2002) Curr Biol 12, 735-739); in fact, for another 10 miRNAs for which only the oligonucleotide corresponding to the active version was present on the chip, no concordance with quantitative real-time PCR results was observed for the precursor.

The stringency of hybridization was, in several instances, sufficient to distinguish nucleotide mismatches for members of closely related miRNA families, and very similar sequences gave distinct expression profiles (for example let-7a-1 and let-7f-2, which are 89% similar in an 88 nucleotide sequence). Therefore, each quantified result represents the specific expression of a single miRNA member and not the combined expression of the entire family. In other cases, when a portion of the oligonucleotide was 100% identical for two probes (for example, the 23mer of active molecule present in the 40-mer oligonucleotides for both mir-16 sequences from chromosome 13 and chromosome 3), very similar profiles were observed. Therefore, both sequence similarity and secondary structure influence the cross-hybridization between different molecules on this type of microarray.
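
The shared stretch between two probes can be located with a longest-common-substring search. The Python sketch below is only an illustration: the function name is hypothetical, and the two 40-mer sequences are the hsa-miR-016a-chr13 and hsa-miR-016b-chr3 probes as listed in the probe table above.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous stretch shared by sequences a and b."""
    best_len, best_end = 0, 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous position of both.
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len, best_end = current[j], i
        previous = current
    return a[best_end - best_len:best_end]


# Probe sequences copied from the probe table (miR-16 from chromosomes 13 and 3).
probe_chr13 = "CAATGTCAGCAGTGCCTTAGCAGCACGTAAATATTGGCGT"
probe_chr3 = "GTTCCACTCTAGCAGCACGTAAATATTGGCGTAGTGAAAT"
shared = longest_common_substring(probe_chr13, probe_chr3)
```

Probe pairs that share a long identical stretch of this kind are the ones for which very similar hybridization profiles would be expected.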

miRNA Expression in Normal Human Tissues. To further validate the reliability of the microarray, we analyzed a panel of 20 RNAs from human normal tissues, including 18 of adult origin (7 hematopoietic and 11 solid tissues) and 2 of fetal origin (fetal liver and brain). For 15 of them, at least two different RNA samples or two replicates from the same preparation were used (for a detailed list of samples see the above Methods). The results demonstrated that different tissues have distinctive patterns of miRNome expression (defined as the full complement of miRNAs in a cell), with each tissue presenting a specific signature. Using unsupervised hierarchical clustering, the same types of tissue from different individuals clustered together. The hematopoietic tissues presented two distinct clusters, the first one containing CD5+ cells, T lymphocytes, and leukocytes and the second cluster containing bone marrow, fetal liver and B lymphocytes. Of note, RNA of fetal or adult type from the same tissue origin (brain) presented different miRNA expression patterns. The results demonstrated that some miRNAs are highly expressed in only one or a few tissues, such as miR-1b-2 or miR-99b in brain, and the closely related members miR-133a and miR-133b in skeletal muscle, heart and prostate. The types of normalization of the GeneSpring software (on 50% with or without a per-gene on median normalization) did not influence these results.

To verify these data, Northern blot analysis was performed on total RNA used in the microarray experiments, using four miRNA probes: miR-16-1, miR-26a, miR-99a and miR-223. In each case, the concordance between the two techniques was high: in all instances the highest and the lowest expression levels were concordant. For example, high levels of miR-223 expression were found by both techniques in spleen, and of miR-16-1 in CD5+ cells, while very low levels were found in brain for both miRNAs. Moreover, in several instances (for example miR-15a), we were able to identify the same pattern of expression for the precursor and the active miR with both microchip and Northern blots.

We also compared the published expression data for cloned human and mouse miRNAs by Northern blot analyses against the microarray results. We found that the concordance with the chip data is high for both pattern and intensity of expression. For example, miR-133 was reported to be strongly expressed only in the skeletal muscle and heart (Sempere, et al. (2003) Genome Biol. 5, R13), precisely as was found with the microarray, while miR-125 and mir-128 were reported to be highly expressed in brain (Sempere, et al. (2003) Genome Biol. 5, R13), a finding confirmed on the microchip.

Example 11 miRNA Profiling of B-Cell Chronic Lymphocytic Leukemia Samples

Introduction

The miRNome expression in 38 individual human B-cell chronic lymphocytic leukemia (CLL) cell samples was determined utilizing the microchip of Example 10. One normal lymph node sample and 5 samples from healthy donors, including two tonsillar CD5+ B lymphocyte samples and three blood mononuclear cell (MNC) samples, were included for comparison. As hereinafter demonstrated, two distinct clusters of CLL samples were associated with the presence or the absence of Zap-70 expression, a predictor of early disease progression. Two miRNA signatures were associated with the presence or absence of mutations in the expressed immunoglobulin variable-region genes or with deletions at 13q14, respectively.

Materials and Methods

The following methods were employed in the miRNome expression study.

Tissue Samples and CLL Samples. 47 samples were used for this study, including 41 samples from 38 patients with CLL, and 6 normal samples, including one lymph node, tonsillar CD5+ B cells from two normal donors and blood mononuclear cells from three normal donors. For three cases, two independent samples were collected and processed. CLL samples were obtained after informed consent from patients diagnosed with CLL at the CLL Research Consortium institutions. Briefly, blood was obtained from CLL patients, and mononuclear cells were isolated through Ficoll/Hypaque gradient centrifugation (Amersham Pharmacia Biotech) and processed for RNA extraction according to described protocols (M. Lagos-Quintana, R. Rauhut, W. Lendeckel, T. Tuschl, Science 294, 853-858 (2001)). For the majority of samples, clinical and biological information, such as age at diagnosis, sex, Rai stage, presence/absence of treatment, ZAP-70 expression, and IgV_(H) gene mutation status, was available, as provided in Table 9:

TABLE 9 Clinical and biological data for the patients in the two CLL clusters*
Cluster | Dx Age | Sex | % Zap | VH gene | Mut
CLL cluster 1 | 50.68 | F | 30.4 | VH4-04 | Neg
CLL cluster 1 | 57.4 | F | 50.6 | VH3-33 | Pos
CLL cluster 1 | 67.49 | M | 0.5 | VH3-23 | Pos
CLL cluster 1 | 59.74 | M | 31.5 | VH3-09 | Pos
CLL cluster 1 | 77.49 | F | 0.3 | VH5-51 | Pos
CLL cluster 1 | 58.19 | F | 3.6 | VH3-30/3-30.5 | Pos
CLL cluster 1 | 43 | M | 41.9 | VH4-30.1/4-31 | Neg
CLL cluster 1 | 61.82 | M | 83.2 | VH1-03 | Neg
CLL cluster 1 | 48.44 | F | 69.3 | VH1-69 | Neg
CLL cluster 2 | 72.59 | M | 2.2 | VH3-72 | Pos
CLL cluster 2 | 45.19 | M | 7.3 | VH1-69 | Pos
CLL cluster 2 | 56.39 | F | 0.6 | VH3-15 | Pos
CLL cluster 2 | 61.85 | F | 0.1 | VH3-30 | Neg
CLL cluster 2 | 60.89 | F | 0.1 | VH2-05 | Pos
CLL cluster 2 | 62.66 | M | 1 | VH3-07 | Pos
CLL cluster 2 | 49.85 | M | 3.6 | VH3-74 | Pos
CLL cluster 2 | 70.62 | M | 0.2 | VH3-13 | Pos
CLL cluster 2 | 68.02 | F | 0.9 | VH3-30.3 | Pos
CLL cluster 2 | 46.84 | M | 62.2 | VH3-30/3-30.5 | Neg
CLL cluster 2 | 51.31 | F | 91.9 | VH4-59 | Neg
CLL cluster 2 | 52.6 | F | 10.6 | VH3-07 | Pos
CLL cluster 2 | 56.04 | F | 0.4 | VH3-72 | Pos
CLL cluster 2 | 61.67 | M | 77.9 | VH3-74 | Neg
CLL cluster 2 | 62.14 | F | 46 | VH1-02 | Pos
CLL cluster 2 | 39.29 | F | 10.1 | VH3-07 | Neg
*Data for ZAP-70 expression were available for 25 patients (25/38, 66%).

Cell Preparation. Mononuclear cells (MNC) from peripheral blood of normal donors were separated by Ficoll-Hypaque density gradients. T cells were purified from these MNC by rosetting with neuraminidase-treated sheep red blood cells (SRBC), and contaminant monocytes (CD11b+), natural killer cells (CD16+) and B lymphocytes (CD19+) were depleted using magnetic beads (Dynabeads, Unipath, Milano, Italy) and specific monoclonal antibodies (Becton Dickinson, San Jose, Calif.). Total B cells and CD5+ B cells were prepared from tonsillar lymphocytes as described (M. Dono et al., J. Immunol. 164, 5596-604 (2000)). Briefly, tonsils were obtained from patients in the pediatric age group undergoing routine tonsillectomies, after informed consent. Purified B cells were prepared by rosetting T cells from MNC cells with neuraminidase-treated SRBC. In order to obtain CD5+ B cells, purified B cells were incubated with anti-CD5 monoclonal antibody followed by goat anti-mouse Ig conjugated with magnetic microbeads. CD5+ B cells were positively selected by collecting the cells retained on the magnetic column MS by Mini MACS system (Miltenyi Biotec, Auburn, Calif.). The degree of purification of the cell preparations was higher than 95%, as assessed by flow cytometry.

RNA Extraction and Northern Blots. Total RNA isolation and blots were performed as described (G. A. Calin et al., Proc Natl Acad Sci USA. 99, 15524-15529 (2002)). After RNA isolation, the washing step with ethanol was not performed, or if performed, the tube walls were rinsed with 75% ethanol without perturbing the RNA pellet (M. Lagos-Quintana, R. Rauhut, W. Lendeckel, T. Tuschl, Science 294, 853-858 (2001)). For reuse, blots were stripped by boiling in 0.1% aqueous SDS/0.1×SSC for 10 minutes, and were reprobed. 5S rRNA stained with ethidium bromide served as a sample loading control.

Microarray Experiments. RNA blot analysis was performed as described in Example 10, utilizing the microchip of Example 10. Briefly, labeled targets from 5 μg of total RNA were used for hybridization on each miRNA microarray chip containing 368 probes in triplicate, corresponding to 245 human and mouse miRNA genes. The microarrays were hybridized in 6×SSPE/30% formamide at 25° C. for 18 hrs, washed in 0.75×TNT at 37° C. for 40 min, and processed using a method of direct detection of the biotin-containing transcripts by Streptavidin-Alexa647 conjugate. Processed slides were scanned using a Perkin Elmer ScanArray® XL5K Scanner, with the laser set to 635 nm, at Power 80 and PMT 70 setting, and a scan resolution of 10 microns.

Data Analysis. Expression profiles were analyzed in duplicate independent experiments starting from the same cell sample. Raw data were normalized and analyzed in GeneSpring® software version 6.1.1 (Silicon Genetics, Redwood City, Calif.). GeneSpring generated an average value of the three spot replicates of each miRNA. Following data transformation (to convert any negative value to 0.01), normalization was performed by using a per-chip on median normalization method and a normalization to specific samples, expressly to the two CD5+ B cell samples, used as common reference for miRNA expression. Hierarchical clustering for both genes and conditions was generated by using standard correlation as a measure of similarity. To identify genes with statistically significant differences between sample groups (i.e., CLL cells and CD5+ B cells, CLL and MNC, CLL samples with or without IgV_(H) mutations, or CLL cases with or without 13q14.3 deletion), a Welch's approximate t-test for two groups (variances not assumed equal) with a p-value cutoff of 0.05 and Benjamini and Hochberg False Discovery Rate as multiple testing correction were performed.
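The pre-processing steps named above (averaging the three spot replicates, converting negative values to 0.01, per-chip median normalization, and normalization to reference samples) can be sketched as follows. This is a minimal illustration and not the GeneSpring implementation: the function names, the order of operations (taken from the order of the sentences above), and the toy intensities are assumptions.

```python
from statistics import mean, median

FLOOR = 0.01  # value substituted for negative intensities, as stated in the text


def average_replicates(spots_by_mirna):
    """Average the replicate spots (three per miRNA on this chip)."""
    return {name: mean(spots) for name, spots in spots_by_mirna.items()}


def per_chip_median_normalize(chip):
    """Floor negative values, then divide every value by the chip median."""
    floored = {name: (FLOOR if v < 0 else v) for name, v in chip.items()}
    chip_median = median(floored.values())
    return {name: v / chip_median for name, v in floored.items()}


def normalize_to_reference(sample, reference_chips):
    """Express each miRNA relative to its mean across the reference chips
    (the two CD5+ B cell samples in the text)."""
    return {
        name: value / mean(ref[name] for ref in reference_chips)
        for name, value in sample.items()
    }


# Toy intensities (hypothetical names and numbers, for illustration only).
raw_spots = {
    "miR-x": [10.0, 12.0, 14.0],
    "miR-y": [-3.0, -1.0, -2.0],
    "miR-z": [4.0, 4.0, 4.0],
}
normalized = per_chip_median_normalize(average_replicates(raw_spots))
```

In this toy chip the averaged values are 12, -2 and 4; the negative value becomes 0.01, the chip median is 4, and the normalized values are 3.0, 0.0025 and 1.0.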

Real Time PCR. Quantitative real-time PCR was performed as described by T. D. Schmittgen, J. Jiang, Q. Liu, L. Yang, Nucleic Acids Research 32, 43-53 (2004). Briefly, RNA was reverse transcribed to cDNA with gene-specific primers and Thermoscript, and the relative amount of each miRNA to tRNA for initiator methionine was described using the equation 2^(−dC_(T)), where dC_(T)=(C_(T miRNA)−C_(T U6 or HUMTMI RNA)). The set of analyzed miRNAs included miR-15a, miR-16-1, miR-18, miR-20, and miR-21. The primers used were as published (Id.).
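The relative-amount equation above can be written out as a short worked example. The function name and the threshold-cycle values below are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_mirna: float, ct_reference: float) -> float:
    """Relative amount of a miRNA versus the reference RNA: 2^(-dCT),
    where dCT = CT(miRNA) - CT(reference)."""
    delta_ct = ct_mirna - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical threshold cycles: the miRNA crosses threshold 3 cycles
# later than the reference RNA, so it is 2^-3 as abundant.
example = relative_expression(ct_mirna=25.0, ct_reference=22.0)
```

With these illustrative values dC_(T) is 3 and the relative amount is 0.125, that is, one eighth of the reference signal.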

Western Blotting. Protein lysates were prepared from the leukemia cells of 7 CLL patients and from isolated tonsillar CD5⁺ B cells. Western blot analysis was performed with a polyclonal Pten antibody (Cell Signaling Technology, Beverly, Mass.) and was normalized using an anti-actin antibody (Sigma, St. Louis, Mo.).

Microarray Data Submission. All data were submitted using MIAMExpress to the Array Express database, and each of the 39 CLL samples described here received an ID number ranging from SAMPLE 169194SUB621 to SAMPLE 169234SIUB621.

Results

Comparison of miRNA expression in CLL cells vs. normal CD5+ B cells and normal blood mononuclear cells. Normal CD5+ B cells utilized in this study are considered as normal cell counterparts to CLL B cells. As described in Table 10, two groups of differentially expressed miRNAs, the first composed of 55 genes and the second of 29 genes, had statistically significant differences in expression levels between the various groups (p<0.05 using Welch t-test as described in Materials and Methods, above). Only 6 miRNAs are shared between the two lists, confirming the results of Example 10 showing distinct miRNome signatures in CD5⁺ B cells and leukocytes. When both pre-miRNA and mature miRNA were observed to be dysregulated (such as for miR-123, miR-132 or miR-136), the same type of variation in CLL samples with respect to CD5 or MNC was noted in every case. Also, for some miRNA genomic clusters all members were aberrantly regulated (such as the up-regulated 7q32 group encompassing miR-96-miR-182-miR-183), while for others only some members were abnormally expressed (such as the 13q31 genomic cluster where two out of six members, miR-19 and miR-92-1, were strongly up-regulated and two, miR-17 and miR-20, were moderately down-regulated). Without wishing to be bound by any theory, the results illustrate the complexity of the patterns of miRNA expression in CLL and indicate the existence of mechanisms regulating individual miRNA genes that map in the same chromosome region. In confirmation of the accuracy of the data, miR-223, reported to be expressed at high levels in granulocytes (M. Lagos-Quintana et al., Curr Biol 12, 735-739 (2002)), was expressed at significantly lower levels in the CLL samples than in the MNC, but at about the same level as that noted for CD5⁺ B cells (which generally constitute less than a few percent of blood MNC).
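The statistical comparison referred to above (Welch's t-test followed by Benjamini and Hochberg correction) can be sketched with the standard library alone. This is an illustrative sketch, not the GeneSpring routine: the function names are invented, and only the t statistic and the Welch-Satterthwaite degrees of freedom are computed, since converting them to a p-value requires a Student t distribution function from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two groups whose variances are not assumed equal."""
    n_a, n_b = len(group_a), len(group_b)
    se2_a = variance(group_a) / n_a  # squared standard error, group A
    se2_b = variance(group_b) / n_b  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value never
    # exceeds the adjusted value of the next-larger p-value.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy inputs (hypothetical numbers, for illustration only).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p <= 0.05 for p in adjusted_p]
```

A miRNA would be retained in a differential-expression list when its adjusted p-value does not exceed the 0.05 cutoff stated in the Methods.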

TABLE 10 Differentially expressed miRNAs in CLLs versus CD5+ cells or CLLs versus MNC (bold)*
Oligonucleotide probe | microRNA | Chr location | FRA associated | P-value | Type
hsa-let-7a-2-precNo1 | let-7a-2 | 11q24.1 | | 0.014 | Down
hsa-let-7d-v2-precNo2 | let-7d-v2-prec | 12q14.1 | | 4.29E−04 | Down
hsa-let-7f-1-precNo1 | let-7f-1 | 09q22.2 | FRA9D | 3.09E−29 | Down
hsa-mir-009-2No1 | miR-9-2 | 5q14 | | 0.013 | Up
hsa-mir-010a-precNo2 | miR-10a-prec | 17q21.3 | | 0.007 | Up
hsa-mir-010b-precNo1 | mir-10b | 02q31 | | 1.10E−15 | Up
hsa-mir-015b-precNo2 | mir-15b-prec | 03q26.1 | | 5.79E−14 | Up
hsa-mir-017-precNo2 | mir-17-prec | 13q31 | | 0.042 | Down
hsa-mir-017-precNo2 | mir-17-prec | 13q31 | | 0.049 | Down
hsa-mir-019a-prec | mir-19a | 13q31 | | 5.16E−17 | Up
hsa-mir-020-prec | mir-20a | 13q31 | | 0.038 | Down
hsa-mir-021-prec-17No2 | mir-21-prec | 17q23.2 | FRA17B | 0.044 | Up
hsa-mir-022-prec | mir-22 | 17p13.3 | | 7.16E−04 | Up
hsa-mir-023a-prec | mir-23a | 19p13.2 | | 0.011 | Down
hsa-mir-024-1-precNo1 | mir-24-1 | 09q22.1 | FRA9D | 0.002 | Down
hsa-mir-024-1-precNo2 | mir-24-1-prec | 09q22.1 | FRA9D | 7.35E−20 | Up
hsa-mir-024-2-prec | mir-24-2 | 19p13.2 | | 5.69E−17 | Down
hsa-mir-025-prec | mir-25 | 07q22 | FRA7F | 9.52E−04 | Down
hsa-mir-027b-prec | mir-27b | 09q22.1 | FRA9D | 0.046 | Down
hsa-mir-029a-2No1 | mir-29a-2 | 07q32 | FRA7H | 0.013 | Up
hsa-mir-029a-2No2 | mir-29a-2-prec | 07q32 | FRA7H | 0.001 | Up
hsa-mir-029c-prec | mir-29c | 01q32.2-32.3 | | 0.002 | Up
hsa-mir-030a-precNo1 | mir-30a | 06q12-13 | | 0.004 | Down
hsa-mir-030a-precNo2 | mir-30a-prec | 06q12-13 | | 0.034 | Down
hsa-mir-030d-precNo2 | mir-30d-prec | 08q24.2 | | 0.008 | Down
hsa-mir-033-prec | mir-33 | 22q13.2 | | 1.56E−18 | Up
hsa-mir-034precNo1 | mir-34 | 01p36.22 | | 6.00E−06 | Up
hsa-mir-092-prec-13=092-1No1 | mir-92-1 | 13q31 | | 1.70E−12 | Up
hsa-mir-092-prec-13=092-1No2 | mir-92-prec | 13q31 | | 0.021 | Down
hsa-mir-092-prec-X=092-2 | mir-92-2 | Xq26.2 | | 3.38E−04 | Down
hsa-mir-092-prec-X=092-2 | mir-92-2 | Xq26.2 | | 0.042 | Down
hsa-mir-096-prec-7No1 | mir-96 | 07q32 | FRA7H | 1.79E−04 | Up
hsa-mir-099-prec-21 | mir-99 | 21q11.2 | | 0.001 | Down
hsa-mir-101-1/2-precNo1 | mir-101 | 01p31.3 | FRA1C | 1.26E−08 | Up
hsa-mir-101-1/2-precNo2 | mir-101-prec | 01p31.3 | | 0.017 | Up
hsa-mir-103-prec-5=103-1 | mir-103-1 | 05q35.1 | | 0.002 | Down
hsa-mir-103-prec-5=103-1 | mir-103-1 | 05q35.1 | | 0.007 | Down
hsa-mir-105-prec-X.1=105-1 | mir-105-1 | Xq28 | FRAXF | 1.55E−05 | Up
hsa-mir-107-prec-10 | mir-107 | 10q23.31 | | 0.002 | Down
hsa-mir-123-precNo1 | mir-123 | 09q34 | | 2.80E−16 | Up
hsa-mir-123-precNo1 | mir-123 | 09q34 | | 0.021 | Down
hsa-mir-123-precNo2 | mir-123-prec | 09q34 | | 0.021 | Down
hsa-mir-124a-2-prec | mir-124a-2 | 08q12.2 | | 4.33E−06 | Up
hsa-mir-128b-precNo1 | mir-128b | 03p22 | | 5.05E−07 | Down
hsa-mir-128b-precNo2 | mir-128-prec | 03p22 | | 0.007 | Up
hsa-mir-130a-precNo2 | mir-130a-prec | 11q12 | | 0.010 | Down
hsa-mir-130a-precNo2 | mir-130a-prec | 11q12 | | 0.050 | Up
hsa-mir-132-precNo1 | mir-132 | 11q12 | | 1.68E−07 | Up
hsa-mir-132-precNo2 | mir-132-prec | 17p13.3 | | 8.62E−04 | Up
hsa-mir-134-precNo1 | mir-134 | 14q32 | | 6.01E−08 | Up
hsa-mir-136-precNo1 | mir-136 | 14q32 | | 0.003 | Up
hsa-mir-136-precNo2 | mir-136-prec | 14q32 | | 7.44E−04 | Up
hsa-mir-137-prec | mir-137 | 01p21-22 | | 0.013 | Up
hsa-mir-138-1-prec | mir-138-1 | 03p21 | | 2.53E−04 | Up
hsa-mir-140No1 | mir-140 | 16q22.1 | | 2.41E−16 | Up
hsa-mir-141-precNo1 | mir-141 | 12p13 | | 7.91E−08 | Up
hsa-mir-141-precNo2 | mir-141-prec | 12p13 | | 1.39E−08 | Up
hsa-mir-142-prec | mir-142 | 17q23 | FRA17B | 0.004 | Down
hsa-mir-145-prec | mir-145 | 05q32-33 | | 0.021 | Down
hsa-mir-146-prec | mir-146 | 05q34 | | 1.03E−08 | Down
hsa-mir-148-prec | mir-148 | 07p15 | | 3.48E−05 | Up
hsa-mir-152-precNo1 | mir-152 | 17q21 | | 0.003 | Up
hsa-mir-152-precNo2 | mir-152-prec | 17q21 | | 3.35E−05 | Up
hsa-mir-153-1-prec1 | mir-153 | 02q36 | | 0.005 | Up
hsa-mir-153-1-prec2 | mir-153-prec | 02q36 | | 1.48E−08 | Up
hsa-mir-154-prec1No1 | mir-154 | 14q32 | | 1.14E−10 | Up
hsa-mir-155-prec | mir-155 | 21q21 | | 0.029 | Up
hsa-mir-181b-precNo2 | mir-181b-prec | 01q31.2-q32.1 | | 3.26E−06 | Up
hsa-mir-181c-precNo2 | mir-181c-prec | 19p13.3 | | 0.003 | Up
hsa-mir-182-precNo2 | mir-182-prec | 07q32 | FRA7H | 0.001 | Up
hsa-mir-183-precNo2 | mir-183-prec | 07q32 | FRA7H | 1.26E−23 | Up
hsa-mir-184-precNo1 | mir-184 | 15q24 | | 0.007 | Up
hsa-mir-188-prec | mir-188 | Xp11.23-p11.2 | | 6.08E−11 | Up
hsa-mir-190-prec | mir-190 | 15q21 | FRA15A | 1.48E−20 | Up
hsa-mir-191-prec | mir-191 | 03p21 | | 9.14E−05 | Down
hsa-mir-192-2/3No1 | mir-192 | 11q13 | | 2.00E−07 | Down
hsa-mir-193-precNo2 | mir-193-prec | 17q11.2 | | 9.14E−05 | Up
hsa-mir-194-precNo1 | mir-194 | 01q41 | FRA1H | 0.002 | Up
hsa-mir-196-2-precNo1 | mir-196-2 | 12q13 | FRA12A | 4.94E−08 | Up
hsa-mir-196-2-precNo2 | mir-196-2-prec | 12q13 | FRA12A | 0.040 | Up
hsa-mir-197-prec | mir-197 | 01p13 | | 0.003 | Down
hsa-mir-200a-prec | mir-200a | 01p36.3 | | 9.14E−05 | Up
hsa-mir-204-precNo2 | mir-204-prec | 09q21.1 | | 8.55E−04 | Up
hsa-mir-206-precNo1 | mir-206 | 06p12 | | 0.003 | Down
hsa-mir-210-prec | mir-210 | 11p15 | | 0.009 | Down
hsa-mir-212-precNo1 | mir-212 | 17p13.3 | | 0.045 | Down
hsa-mir-213-precNo1 | mir-213 | 01q31.3-q32.1 | | 1.47E−33 | Down
hsa-mir-217-precNo2 | mir-217 | 02p16 | | 3.85E−09 | Up
hsa-mir-220-prec | mir-220 | Xq25 | | 2.14E−09 | Down
hsa-mir-220-prec | mir-220 | Xq25 | | 3.16E−05 | Down
hsa-mir-221-prec | mir-221 | Xp11.3 | | 1.39E−05 | Down
hsa-mir-223-prec | mir-223 | Xq12-13.3 | | 9.04E−04 | Down
*The correlation with fragile sites (FRA) location is as published in Calin et al., Proc Natl Acad Sci USA. 101, 2999-3004 (2004).

As indicated in the CLL vs. CD5+ B cell list of Table 10, the differentially expressed miRNAs include several located exactly inside fragile sites (miR-183 at FRA7H, miR-190 at FRA15A and miR-24-1 at FRA9D), as well as miR-213. The mature miR-213 molecule is expressed at lower levels in all the CLL samples, and the precursor miR-213 is reduced in expression in 62.5% of the samples. miR-16-1, at 13q14.3, which we previously reported to be down-regulated in the majority of CLL cases by microarray analysis (G. A. Calin et al., Proc Natl Acad Sci USA. 99, 15524-15529 (2002)), was expressed at low levels in 45% of CLL samples. An identical mature miR-16 exists on chromosome 3; because the 40-mer oligonucleotides for both miR-16 sequences from chromosome 13 (miR-16-1) and chromosome 3 (miR-16-2) contain the same 23-mer mature sequence, very similar profiles were observed. However, since we observed very low levels of miR-16-2 expression in CLL samples by Northern blot, the expression observed is mainly contributed by miR-16-1. The other miRNA of 13q14.3, miR-15a, was expressed at low levels in ˜25% of CLL cases. Overall, these data demonstrate that CLL is a malignancy with extensive alterations of miRNA expression.

Validation of the microarray data was supplied for four miRNAs by Northern blot analyses: miR-16-1, located within the region of deletion at 13q14.3; miR-26a, on chromosome 3 in a region not involved in the pathogenesis of CLL; and miR-206 and miR-223, which are down-regulated (see above) in the majority of samples. For all four miRNAs, the Northern blot analyses confirmed the data obtained using the microarray. We also performed real-time PCR to measure expression levels of precursor molecules for five genes (miR-15a, miR-16-1, miR-18, miR-21, and miR-30d), and we found results concordant with the chip data.

Unsupervised hierarchical clustering generated two clearly distinguishable miRNA signatures within the set of CLL samples, one closer to the miRNA expression profile observed in human leukocytes and the other clearly different (FIG. 3). A list of the microRNAs differentially expressed between the two main CLL clusters is given in Table 11. The name of each miRNA is as in the miRNA Registry. The dysregulation of either active molecule or precursor is specified in the name. The location in minimally deleted or minimally amplified or breakpoint regions or in fragile sites is presented. The top 25 differentially expressed miRNAs in these two signatures (at p<0.001) include genes known or suggested to be involved in cancer. The precursor of miR-155 is over-expressed in the majority of childhood Burkitt's lymphoma (M. Metzler, M. Wilda, K. Busch, S. Viehmann, A. Borkhardt, Genes Chromosomes Cancer. 39, 167-9 (2004)), miR-21 is located at the fragile site FRA17B (G. A. Calin et al., Proc Natl Acad Sci USA. 101, 2999-3004 (2004)), miR-26a is at 3p21.3, a region frequently deleted in epithelial cancers, while miR-92-1 and miR-17 are at 13q32, a region amplified in follicular lymphoma (Id.).
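The idea of unsupervised hierarchical clustering of samples by the similarity of their expression profiles can be sketched as below. This is an illustration under stated assumptions, not the GeneSpring procedure: Pearson correlation is used as a stand-in similarity measure (the "standard correlation" option of GeneSpring may be defined differently), average linkage is an assumed choice, and the sample names and profiles are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cluster_samples(profiles, n_clusters=2):
    """Agglomerative average-linkage clustering on distance 1 - r."""
    clusters = [[name] for name in profiles]
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy expression profiles (hypothetical samples and values).
toy_profiles = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [2.0, 4.0, 6.0, 9.0],
    "s3": [4.0, 3.0, 2.0, 1.0],
    "s4": [9.0, 6.0, 4.0, 2.0],
}
two_groups = cluster_samples(toy_profiles, n_clusters=2)
```

In the toy data, samples s1 and s2 rise together while s3 and s4 fall together, so cutting the tree at two clusters separates the two pairs.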

TABLE 11 microRNAs differentially expressed between the two main CLL clusters*
Oligonucleotide | miRNA | Chr location | P-value | Cancer-associated genomic regions
hsa-miR-017-precNo2 | miR-17-prec | 13q31 | 0.00000000 | Amp - Follicular Ly/Del - HCC
hsa-miR-020-prec | miR-20 | 13q31 | 0.00000000 | Amp - Follicular Ly/Del - HCC
hsa-miR-103-2-prec | miR-103-2 | 20p13 | 0.00000001 |
hsa-miR-030d-precNo2 | miR-30d-prec | 08q24.2 | 0.00000002 |
hsa-miR-106-prec-X | miR-106 | Xq26.2 | 0.00000006 | Del - Advanced ovarian ca.
hsa-miR-026b-prec | miR-26b | 02q35 | 0.00000006 |
hsa-miR-103-prec-5 = 103-1 | miR-103-1 | 05q35.1 | 0.00000006 |
hsa-miR-025-prec | miR-25 | 07q22 | 0.00000007 | FRA7F
hsa-miR-030a-precNo1 | miR-30a | 06q12-13 | 0.00000008 |
hsa-miR-021-prec-17No1 | miR-21 | 17q23.2 | 0.00000008 | Amp - Neuroblastoma; FRA17B
hsa-miR-107-prec-10 | miR-107 | 10q23.31 | 0.00000008 |
hsa-miR-092-prec-13 = 092-1No2 | miR-92-1-prec | 13q31 | 0.00000024 | Amp - Follicular Ly.
hsa-miR-027a-prec | miR-27a | 19p13.2 | 0.00000024 |
hsa-miR-023a-prec | miR-23a | 19p13.2 | 0.00000032 |
hsa-miR-092-prec-X = 092-2 | miR-92-2 | Xq26.2 | 0.00000040 | Del - Advanced ovarian ca.
hsa-miR-030b-precNo1 | miR-30b | 08q24.2 | 0.000004 |
hsa-miR-026a-precNo1 | miR-26a | 03p21 | 0.000009 | Del - Epithelial malignancies
hsa-miR-093-prec-7.1 = 093-1 | miR-93-1 | 07q22 | 0.000009 | Amp - Follicular Ly/Del - HCC; FRA7F
hsa-miR-194-precNo1 | miR-194 | 01q41 | 0.000015 | FRA1H
hsa-miR-155-prec | miR-155 | 21q21 | 0.000028 | Amp - Colon ca; Childhood Burkitt Ly
hsa-miR-153-2-prec | miR-153-2 | 07q36 | 0.000028 | t(7; 12)(q36; p13) - Acute Myeloid Leukemia
hsa-miR-193-precNo2 | miR-193-prec | 17q11.2 | 0.000044 | Del - Ovarian ca.
hsa-miR-130a-precNo1 | miR-130a | 11q12 | 0.0001 |
hsa-miR-023b-prec | miR-23b | 09q22.1 | 0.0001 | Del - Urothelial ca.; FRA9D
hsa-miR-030c-prec | miR-30c | 06q13 | 0.0001 |
hsa-miR-139-prec | miR-139 | 11q13 | 0.0001 |
hsa-miR-144-precNo2 | miR-144-prec | 17q11.2 | 0.0001 | Amp - Primary breast ca.
hsa-miR-29b-2 = 102prec7.1 = 7.2 | miR-29b-2 | 07q32 | 0.0002 | Del - Prostate ca aggressiveness; FRA7H
hsa-miR-125a-precNo2 | miR-125a-prec | 19q13.4 | 0.0002 |
hsa-miR-224-prec | miR-224 | Xq28 | 0.0002 |
hsa-miR-211-precNo1 | miR-211 | Xp11.3 | 0.0002 | Del - Malignant mesothelioma
hsa-miR-221-prec | miR-221 | Xp11.3 | 0.0002 |
hsa-miR-191-prec | miR-191 | 03p21 | 0.0002 |
hsa-miR-018-prec | miR-18 | 13q31 | 0.0003 | Amp - Follicular lymphoma
hsa-miR-203-precNo2 | miR-203-prec | 14q32.33 | 0.0004 | Del - Nasopharyngeal ca.
hsa-miR-217-precNo2 | miR-217-prec | 02p16 | 0.0004 |
hsa-miR-204-precNo2 | miR-204-prec | 09q21.1 | 0.0004 |
hsa-miR-199a-1-prec | miR-199a-1 | 19p13.2 | 0.0005 |
hsa-miR-128b-precNo1 | miR-128b | 03p22 | 0.0005 |
hsa-miR-102-prec-1 | miR-102 | 01q32.2-32.3 | 0.0005 | Del - Prostate ca aggressiveness
hsa-miR-140No2 | miR-140-prec | 16q22.1 | 0.0006 |
hsa-miR-199a-2-prec | miR-199a-2 | 01q23.3 | 0.0007 |
hsa-miR-010b-precNo2 | miR-10b-prec | 02q31 | 0.0008 |
hsa-miR-029a-2No1 | miR-29a-2 | 07q32 | 0.0008 | Del - Prostate ca aggressiveness; FRA7H
hsa-miR-125a-precNo1 | miR-125a | 19q13.4 | 0.0010 |
hsa-miR-204-precNo1 | miR-204 | 09q21.1 | 0.0011 |
hsa-miR-181a-precNo1 | miR-181a | 09q33.1-34.13 | 0.0014 | Del - Bladder ca
hsa-miR-188-prec | miR-188 | Xp11.23-p11.2 | 0.0014 |
hsa-miR-200a-prec | miR-200a | 01p36.3 | 0.0014 |
hsa-miR-024-2-prec | miR-24-2 | 19p13.2 | 0.0014 |
hsa-miR-134-precNo2 | miR-134-prec | 14q32 | 0.0016 | Del - Nasopharyngeal ca.
hsa-miR-010a-precNo2 | miR-10a-prec | 17q21.3 | 0.0018 |
hsa-miR-029c-prec | miR-29c | 01q32.2-32.3 | 0.0021 |
hsa-miR-010a-precNo1 | miR-10a | 17q21.3 | 0.0022 |
hsa-let-7d-v2-precNo1 | let-7d-v2 | 12q14.1 | 0.0022 | Del - Urothelial carc; FRA9D
hsa-miR-205-prec | miR-205 | 01q32.2 | 0.0023 |
hsa-miR-129-precNo1 | miR-129 | 07q32 | 0.0023 | Del - Prostate ca aggressiveness
hsa-miR-032-precNo2 | miR-32-prec | 09q31.2 | 0.0026 | Del - Lung ca.; FRA9E
hsa-miR-187-precNo2 | miR-187-prec | 18q12.1 | 0.0035 |
hsa-miR-125b-2-precNo1 | miR-125b-2 | 21q11.2 | 0.0036 | Del - Lung ca. (MA17)
hsa-miR-181c-precNo1 | miR-181c | 19p13.3 | 0.0036 |
hsa-miR-132-precNo2 | miR-132-prec | 17p13.3 | 0.0036 | Del - HCC
hsa-miR-215-precNo1 | miR-215 | 01q41 | 0.0036 | FRA1H
hsa-miR-136-precNo1 | miR-136 | 14q32 | 0.0036 | Del - Nasopharyngeal ca.
hsa-miR-030a-precNo2 | miR-30a-prec | 06q12-13 | 0.0040 |
hsa-miR-100-1/2-prec | miR-100 | 11q24.1 | 0.0040 | Del - Ovarian ca.; FRA11B
hsa-miR-218-2-precNo1 | miR-218-2 | 05q35.1 | 0.0040 |
hsa-miR-193-precNo1 | miR-193 | 17q11.2 | 0.0052 | Del - Ovarian ca.
hsa-miR-027b-prec | miR-27b | 09q22.1 | 0.0058 | Del - Bladder ca; FRA9D
hsa-miR-220-prec | miR-220 | Xq25 | 0.0065 |
hsa-miR-024-1-precNo1 | miR-24-1 | 09q22.1 | 0.0065 | Del - Urothelial ca.
hsa-miR-019a-prec | miR-19a | 13q31 | 0.0071 | Amp - Follicular Ly
hsa-miR-196-2-precNo1 | miR-196-2 | 12q13 | 0.0082 | FRA12A
hsa-miR-022-prec | miR-22 | 17p13.3 | 0.0086 | Del - HCC
hsa-miR-183-precNo2 | miR-183-prec | 07q32 | 0.0086 | Del - Prostate ca aggressiveness; FRA7H
hsa-miR-128a-precNo2 | miR-128a-prec | 02q21 | 0.0105 | Del - Gastric ca
hsa-miR-203-precNo1 | miR-203 | 14q32.33 | 0.0109 | Del - Nasopharyngeal ca.
hsa-miR-033b-prec | miR-33b | 17p11.2 | 0.0109 | Amp - Breast ca.
hsa-miR-030d-precNo1 | miR-30d | 08q24.2 | 0.0111 |
hsa-miR-133a-1 | miR-133a-1 | 18q11.1 | 0.0119 |
hsa-miR-007-3-precNo2 | miR-7-3-prec | 22q13.3 | 0.0128 |
hsa-miR-021-prec-17No2 | miR-21-prec | 17q23.2 | 0.0131 | Amp - Neuroblastoma
hsa-miR-208-prec | miR-208 | 14q11.2 | 0.0134 | Del - Malignant mesothelioma
hsa-miR-154-prec1No2 | miR-154-prec | 14q32 | 0.0146 | Del - Nasopharyngeal ca.
hsa-miR-141-precNo2 | miR-141-prec | 12p13 | 0.0154 |
hsa-miR-024-1-precNo2 | miR-024-1-prec | 09q22.1 | 0.0169 | Del - Urothelial carc; FRA9D
hsa-miR-128a-precNo1 | miR-128a | 02q21 | 0.0170 | Del - Gastric ca
hsa-miR-184-precNo2 | miR-184-prec | 15q24 | 0.0219 |
hsa-miR-019b-2-prec | miR-19b-2 | 13q31 | 0.0302 |
hsa-miR-132-precNo1 | miR-132 | 17p13.3 | 0.0303 | Del - Hepatocellular ca. (HCC)
hsa-miR-127-prec | miR-127 | 14q32 | 0.0326 | Del - Nasopharyngeal ca.
hsa-miR-202-prec | miR-202 | 10q26.3 | 0.0333 |
hsa-let-7g-precNo2 | let-7g-prec | 03p21.3 | 0.0350 | Del - Lung ca., Breast ca.
hsa-miR-222-precNo1 | miR-222 | Xp11.3 | 0.0351 |
hsa-miR-009-1No2 | miR-009-1-prec | 05q14 | 0.0382 |
hsa-miR-136-precNo2 | miR-136-prec | 14q32 | 0.0391 | Del - Nasopharyngeal ca.
hsa-miR-010b-precNo1 | miR-10b | 02q31 | 0.0403 |
hsa-miR-223-prec | miR-223 | Xq12-13.3 | 0.0407 |
*The location in minimally deleted or minimally amplified or breakpoint regions or in fragile sites is presented. HCC: hepatocellular ca.; AML: acute myeloid leukemia.

The two clusters may be distinguished by at least one clinico-biological factor. A large difference in the levels of ZAP-70 characterized the two groups: 66% (6/9) of patients from the first cluster vs. 25% (4/16) of patients from the second one have low levels of ZAP-70 (<20%) (P=0.04 by chi-square test) (Table 9). The mean value of ZAP-70 was 19% (±31% S.D.) vs. 35% (±30% S.D.), respectively; in other words, the two clusters can discriminate between patients who express and who do not express this protein (at levels <20%, ZAP-70 is considered as non-expressed) (Table 9). ZAP-70 is a tyrosine kinase that is a strong predictor of early disease progression, and low levels of expression have been shown to be associated with good prognosis (J. A. Orchard et al., Lancet 363, 105-11 (2004)).
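The reported comparison of proportions (6/9 vs. 4/16) can be checked with a 2×2 chi-square calculation. The sketch below assumes an uncorrected Pearson chi-square with one degree of freedom; the text does not state whether a continuity correction was applied, and the function name is invented for this example.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)
    # For one degree of freedom the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts as stated in the text: 6 of 9 patients in the first cluster and
# 4 of 16 patients in the second cluster fall in the same ZAP-70 category.
chi2, p_value = chi_square_2x2(6, 3, 4, 12)
```

Under these assumptions the statistic is about 4.17 and the p-value is about 0.04, in line with the value quoted above.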

The microarray data revealed specific molecular signatures predictive for subsets of CLL that differ in clinical behavior. CLL cases harbor deletions at chromosome 13q14.3 in approximately 50% of cases (F. Bullrich, C. M. Croce, Chronic Lymphoid leukemia. B. D. Chenson, Ed. (Dekker, New York, 2001)). As a single cytogenetic defect, these CLL patients have a relatively good prognosis, compared with patients with leukemia cells harboring complex cytogenetic changes (H. Dohner et al., N Engl J Med. 343, 1910-6 (2000)). It was also shown that deletion at 13q14.3 was associated with the presence of mutated immunoglobulin V_(H) (IgV_(H)) genes (D. G. Oscier et al., Blood. 100, 1177-84 (2002)), another good prognostic factor. By comparing expression data of CLL samples with or without deletions at 13q14, we found that miR-16-1 was expressed at low levels in leukemias harboring deletions at 13q14 (p=0.03, ANOVA test). We also found that miR-24-2, miR-195, miR-203, miR-220 and miR-221 are expressed at significantly reduced levels, while miR-7-1, miR-19a, miR-136, miR-154, miR-217 and the precursor of miR-218-2 are expressed at significantly higher levels in the samples with 13q14.3 deletions, respectively (Table 12). All these genes are located in different regions of the genome and differ in their nucleotide sequences, excluding the possibility of cross-hybridization. Without wishing to be bound by any theory, these results suggest the existence of functional miRNA networks in which hierarchical regulation may be present, with some miRNAs (such as miR-16-1) controlling or influencing the expression of other miRNAs.

TABLE 12 microRNA signatures associated with prognosis in B-CLL¹
miRNA | Chr. location | P-value | Association | Observation
miR-7-1 | 9q21.33 | 0.030 | 13q14 normal |
miR-16-1 | 13q14.3 | 0.030 | IGVH mutations negative |
miR-16-1 | 13q14.3 | 0.023 | 13q14 deleted |
miR-19a | 13q31 | 0.024 | 13q14 normal |
miR-24-2 | 19p13.2 | 0.033 | 13q14 deleted |
miR-29c | 1q32.2-32.3 | 0.018 | IGVH mutations positive | cluster miR-29c-miR 102
miR-102 | 1q32.2-32.3 | 0.023 | IGVH mutations positive | cluster miR-29c-miR 102
miR-132 | 17p13.3 | 0.033 | IGVH mutations negative |
miR-136 | 14q32 | 0.045 | 13q14 normal |
miR-154 | 14q32 | 0.020 | 13q14 normal |
miR-186 | 1p31 | 0.038 | IGVH mutations negative |
mir-195 | 17q13 | 0.036 | 13q14 deleted |
miR-203 | 14q32.33 | 0.026 | 13q14 deleted |
miR-217-prec | 2p16 | 0.005 | 13q14 normal |
miR-218-2 | 5q35.1 | 0.019 | 13q14 normal |
miR-220 | Xq25 | 0.026 | 13q14 deleted |
miR-221 | Xp11.3 | 0.021 | 13q14 deleted |
¹The name of each miRNA is as in the miRNA Registry, and the dysregulation of either active molecule or precursor is specified in the name.

The expression of mutated IgV_(H) is a favorable prognostic marker (D. G. Oscier et al., Blood. 100, 1177-84 (2002)). We found a distinct miRNA signature composed of 5 differentially expressed genes (miR-186, miR-132, miR-16-1, miR-102 and miR-29c) that distinguished CLL samples that expressed mutated IgV_(H) genes from those that expressed unmutated IgV_(H) genes, indicating that miRNA expression profiles have prognostic significance in CLL. A confirmation of our results is the observation that the common element between the del 13q14.3-related and the IgV_(H)-related signatures is miR-16-1. This gene is located in the common deleted region 13q14.3, and the presence of this particular deletion is associated with good prognosis. Therefore, miRNAs expand the spectrum of adverse prognostic markers in CLL, such as expression of ZAP-70, unmutated IgV_(H), CD38, deletion at chromosome 11q23, or loss or mutation of TP53.
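The overlap between the two signatures can be verified directly from the Association column of Table 12. The short sketch below simply encodes the miRNA names listed there as two sets; the variable names are illustrative.

```python
# miRNAs whose Table 12 association is with 13q14 status (normal or deleted).
del_13q14_signature = {
    "miR-7-1", "miR-16-1", "miR-19a", "miR-24-2", "miR-136", "miR-154",
    "mir-195", "miR-203", "miR-217-prec", "miR-218-2", "miR-220", "miR-221",
}

# miRNAs whose Table 12 association is with IGVH mutation status.
igvh_signature = {"miR-16-1", "miR-29c", "miR-102", "miR-132", "miR-186"}

# Elements common to both signatures.
shared = del_13q14_signature & igvh_signature
```

The intersection contains a single element, miR-16-1, consistent with the statement above.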

Example 12 Identification of miRNA Signature Profiles Associated with Prognostic Factors and Disease Survival in B-Cell Chronic Leukemia Samples

Introduction

Knowing that the expression profile of the miRNome, the full complement of microRNAs in a cell, is different between malignant CLL cells and normal corresponding cells, we asked whether microarray analysis using the miRNACHIP could reveal specific molecular signatures predictive for subsets of CLL that differ in clinical behavior. The miRNome expression in 94 CLL samples was determined utilizing the microchip of Example 10. miRNA expression profiles were analyzed to determine if distinct molecular signatures are associated with the presence or absence of two prognostic markers, ZAP-70 expression and mutation of the IgV_(H) gene. The microarray data revealed that two specific molecular signatures were associated with the presence or absence of each of these markers. An analysis of expression profiles from Zap-70 positive/IgV_(H) unmutated (Umut) vs. Zap-70 negative/IgV_(H) mutated (Mut) CLL samples revealed a unique signature of 17 genes that can distinguish these two subsets. Our results indicate that miRNA expression profiles have prognostic significance in CLL.

Materials and Methods

Patient Samples and Clinical Database. 94 CLL samples were used for this study, which were obtained after informed consent from patients diagnosed with CLL at the CLL Research Consortium institutions (L. Z. Rassenti et al. N. Engl. J. Med. 351(9):893-901 (2004)). Briefly, blood was obtained from CLL patients, and mononuclear cells were isolated through Ficoll/Hypaque gradient centrifugation (Amersham Pharmacia Biotech) and processed for RNA extraction according to described protocols (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 99, 15524-15529 (2002)). For each sample, clinical and biological information, such as sex, age at diagnosis, Rai stage, presence/absence of treatment, time between diagnosis and therapy, ZAP-70 expression, and IgV_(H) gene mutation status, was available and is described in Table 13.

TABLE 13 Characteristics of patients analyzed with the miRNACHIP.
Characteristic | Value
Male sex - no. of patients (%) | 58 (61.7)
Age at diagnosis - years, median | 57.3
Age at diagnosis - years, range | 38.2
Therapy begun: No - no. of patients | 53
Therapy begun: No - time since diagnosis, months | 87.07
Therapy begun: Yes - no. of patients | 41
Therapy begun: Yes - time between diagnosis & therapy, months | 40.27
ZAP-70 level ≦20% | 48
ZAP-70 level >20% | 46
IgV_(H) Unmutated (≧98% homology) | 57
IgV_(H) Mutated (<98% homology) | 37

RNA Extraction and Northern Blots. Total RNA isolation and RNA blotting were performed as described (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 99, 15524-15529 (2002)).

Microarray Experiments. Microarray experiments were performed as described in Example 11. Of note, for 76 microRNAs on the miRNACHIP, two specific oligonucleotides were synthesized: one identifying the active 22 nucleotide part of the molecule and the other identifying the 60-110 nucleotide precursor. All probes on these microarrays are 40-mer oligonucleotides spotted by contacting technologies and covalently attached to a polymeric matrix.

Data Analysis. After construction of the expression table with GeneSpring, data normalization was performed using the Bioconductor package. Analyses were carried out using the PAM package (Prediction Analysis of Microarrays) and SAM (Significance Analysis of Microarrays) software. The data were confirmed by Northern blotting for 4 microRNAs in 20 CLL samples each. All data were submitted using MIAMExpress to the Array Express database.

Analysis of ZAP-70 and Sequence analysis of expressed IgV_(H). Analyses were performed as described previously (L. Z. Rassenti et al., N. Engl. J. Med. 351(9):893-901 (2004)). Briefly, ZAP-70 expression was assessed by immunoblot analysis and flow cytometry, while the analysis of expressed IgV_(H) was performed by direct sequencing.

Results

Comparison of miRNA expression in ZAP-70 positive vs. ZAP-70 negative CLL cells. Using 20% as a cutoff for defining ZAP-70 positivity, we constructed two classes that were constituted of 48 ZAP-70-negative and 46 ZAP-70-positive CLL samples, respectively. The analyses carried out using the PAM package identified an expression signature composed of 14 microRNAs (14/190 miRNAs on chip, 7.35%) with a PAM score >±0.02 (Table 14). Using the expression of these microRNAs, it is possible to predict with a low misclassification error (about 0.2 at cross-validation) the type of ZAP-70 expression in a patient's malignant B cells.

Comparison of miRNA expression in IgV_(H) positive vs. IgV_(H) negative CLL cells. The expression of a mutated IgV_(H) gene is a favorable prognostic marker (D. G. Oscier et al., Blood 100, 1177-84 (2002)). ZAP-70 expression is well correlated with the status of the IgV_(H) gene. Therefore, we asked whether a specific microRNA signature can predict the mutated (Mut) vs. unmutated (Umut) status of this gene. Using the 98% cutoff for homology with germ-line IgV_(H), we identified two groups of patients composed of 37 Umut (≧98% homology) and 57 Mut (<98% homology). Based on this analysis, 12 microRNAs can be used to correctly predict the Umut vs. Mut status of the gene with a low error (0.02) (Table 14). All of these genes are included in the previous signature.

Comparison of miRNA expression in Zap-70 positive/IgV_(H) Umut vs. Zap-70 negative/IgV_(H) Mut CLL cells. We divided the 94 CLL cases into 4 groups (Zap-70 positive/IgV_(H) Umut, Zap-70 positive/IgV_(H) Mut, Zap-70 negative/IgV_(H) Umut and Zap-70 negative/IgV_(H) Mut), and found, using the PAM package, that the same unique signature composed of 17 genes can discriminate between the two main groups of patients, Zap-70 positive/IgV_(H) Umut and Zap-70 negative/IgV_(H) Mut. In this case, we observed the lowest classification error (0.015 at cross-validation). Only one patient was Zap-70 negative and IgV_(H) Umut, and therefore was not used in the classification. When the remaining three classes were analyzed, the 10 patients belonging to the Zap-70 positive/IgV_(H) Mut class were always misclassified, which indicates that there are no microRNAs on the miRNACHIP that can compose a different signature. The same unique signature was identified using another algorithm of microarray analysis, SAM, thereby confirming the reproducibility of our results. These results indicate that miRNA expression profiles have prognostic significance in CLL and can be used for diagnosing the disease state of a particular cancer by determining whether or not a given profile is characteristic of a cancer associated with one or more adverse prognostic markers.
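The four-way grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the group labels and the sample values are hypothetical, and only the two cutoffs (ZAP-70 expression >20% counted as positive; germ-line homology ≧98% counted as unmutated) are taken from the text.

```python
from collections import Counter

# Cutoffs taken from the text; everything else in this sketch is hypothetical.
ZAP70_CUTOFF = 20.0     # percent ZAP-70 expression; > cutoff counts as positive
HOMOLOGY_CUTOFF = 98.0  # percent germ-line homology; >= cutoff counts as unmutated


def prognostic_group(zap70_percent, ighv_homology_percent):
    """Assign a sample to one of the four ZAP-70 / IgVH groups."""
    zap = "ZAP-70 positive" if zap70_percent > ZAP70_CUTOFF else "ZAP-70 negative"
    ighv = "IgVH Umut" if ighv_homology_percent >= HOMOLOGY_CUTOFF else "IgVH Mut"
    return f"{zap}/{ighv}"


# Made-up samples: (ZAP-70 %, IgVH homology %)
samples = [(45.0, 100.0), (3.0, 92.5), (20.0, 97.9), (31.0, 95.0), (12.0, 99.1)]
counts = Counter(prognostic_group(z, h) for z, h in samples)

for group, n in sorted(counts.items()):
    print(group, n)
```

Samples falling in the two main groups would then be passed to the classifier, while small groups (such as the single Zap-70 negative/IgV_(H) Umut case reported above) can be set aside.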

TABLE 14 A miRNA signature associated with prediction factors and disease survival in CLL patients.
Signature component | ZAP-70+ vs. Zap-70− | IgV_(H) Mut vs. IgV_(H) Umut | Zap70+/IgV_(H) Umut vs. Zap70−/IgV_(H) Mut | Short vs. Long time to initial therapy | Observation
mir-015a | −0.0728 vs. 0.076 | NA | −0.0372 vs. 0.0485 | NA | cluster 15a/16-1; del CLL, prostate ca. 13q14.3 (G. A. Calin et al., Proc. Natl. Acad. Sci. USA 99, 15524-15529 (2002))
mir-016-1 | −0.1396 vs. 0.1457 | −0.0852 vs. 0.1312 | −0.1444 vs. 0.1886 | NA | del CLL, prostate ca. 13q14.3 (G. A. Calin et al., Proc. Natl. Acad. Sci. USA 99, 15524-15529 (2002))
mir-016-2 | −0.1615 vs. 0.1685 | −0.0969 vs. 0.1493 | −0.1619 vs. 0.2113 | NA | identical 16-1/16-2
mir-023a | −0.0235 vs. 0.0245 | 0.0647 vs. 0.0997 | −0.0748 vs. 0.0977 | 0.0587 vs. −0.019 | cluster 23a/24-2
mir-023b | −0.0658 vs. 0.0686 | −0.0663 vs. 0.1021 | −0.0909 vs. 0.1187 | 0.0643 vs. −0.0208 | cluster 24-1/23b; FRA 9D; del Urothelial ca. 9q22 (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 101(32):11755-60 (2004))
mir-024-1 | NA | −0.042 vs. 0.0648 | −0.0427 vs. 0.0558 | NA | FRA 9D; del Urothelial ca. 9q22 (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 101(32):11755-60 (2004))
mir-024-2 | NA | NA | −0.0272 vs. −0.0355 | 0.0696 vs. −0.0225 |
mir-029a | 0.0806 vs. −0.0842 | 0.0887 vs. −0.1367 | 0.1139 vs. −0.1487 | NA | cluster 29a/29b-1; FRA7H; del Prostate ca. 7q32 (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 101(32):11755-60 (2004))
mir-29b-2 | 0.1284 vs. −0.134 | 0.1869 vs. −0.2879 | 0.2065 vs. −0.2696 | NA | 1q32.2-32.3
mir-029c | 0.1579 vs. −0.1648 | 0.1846 vs. −0.2844 | 0.2174 vs. −0.2839 | −0.0221 vs. 0.0072 |
mir-146 | −0.1518 vs. 0.1584 | −0.1167 vs. 0.1798 | −0.1803 vs. 0.2354 | 0.07 vs. −0.0227 |
mir-155 | −0.1015 vs. 0.1059 | −0.0743 vs. 0.1145 | −0.1155 vs. 0.1508 | 0.1409 vs. −0.0456 | amp child Burkitt's lymphoma, colon ca. (M. Metzler et al., Genes Chromosomes Cancer 39(2):167-9 (2004)) and (M. Z. Michael et al., Mol. Cancer Res. 1(12):882-91 (2003))
mir-181a | −0.0473 vs. 0.0494 | NA | −0.0279 vs. 0.0364 | 0.1862 vs. −0.0603 | Up-regulated in differentiated B ly (C. Z. Chen et al., Science 303(5654):83-6 (2004))
mir-195 | −0.0679 vs. 0.0708 | NA | −0.053 vs. 0.0692 | NA |
mir-221 | −0.0812 vs. 0.0848 | −0.0839 vs. 0.1292 | −0.1157 vs. 0.1511 | 0.0343 vs. −0.0111 | cluster 221/222
mir-222 | NA | NA | −0.022 vs. 0.0288 | 0.0458 vs. −0.0148 |
mir-223 | 0.0522 vs. −0.0544 | 0.1036 vs. −0.1596 | 0.1056 vs. −0.1379 | NA | Normally expression restricted to myeloid lineage (C. Z. Chen et al., Science 303(5654):83-6 (2004))
Note: ZAP-70 negative = ZAP-70 expression ≦20%; ZAP-70 positive = ZAP-70 expression >20%; IgV_(H) unmutated = homology ≧98%; IgV_(H) mutated = homology <98%. The numbers indicate the PAM scores in the two classes (n score and y score). mir-29b-2 was previously named mir-102.

Association between miRNA expression and time to initial therapy. Treatment of patients according to the National Cancer Institute Working Group criteria (B. D. Cheson et al., Blood 87(12):4990-7 (1996)) was performed when symptomatic or progressive disease developed. Of the 94 patients studied, 41 had initiated therapy (Table 13). We examined the relationship between the expression of 190 microRNA genes and either the time from diagnosis to initial therapy (for patients who have begun treatment) or the time from diagnosis to the present (for those patients who have not begun treatment), collectively representing the total group of 94 patients in the study. We found that the expression profile generated by a spectrum of 9 microRNAs, all components of the unique signature, can differentiate between two subsets of patients in the group of 94 tested: one subset with a short interval from diagnosis to initial therapy and a second subset with a significantly longer interval (see Table 14 and FIG. 5). The significance of the Kaplan-Meier curves improves if we restrict the analyses to the two main groups of 83 patients (the Zap-70 positive/IgV_(H) Umut and Zap-70 negative/IgV_(H) Mut groups) or if we use only the 17 microRNAs from the signature (P decreases from <0.01 to P<0.005 and P<0.001, respectively). All of the microRNAs that can predict the time to initial therapy, with the exception of mir-29c, are overexpressed in the group characterized by a short interval from diagnosis to initial therapy.
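A Kaplan-Meier curve for time to initial therapy treats patients who have started therapy as events and patients still untreated at last follow-up as censored observations. The following Python sketch of the standard product-limit estimator illustrates that idea; it is not the PAM "survival analysis" code used in the study, and the function name and the five example patients are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the fraction of patients still untreated.

    times  -- months from diagnosis to therapy (event) or to last follow-up
    events -- True if therapy was started at that time, False if censored
    Returns a list of (time, estimated untreated fraction) at each event time.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        n_events = 0
        n_at_t = 0
        # Collect every patient whose recorded time equals t.
        while i < len(order) and order[i][0] == t:
            n_events += int(order[i][1])
            n_at_t += 1
            i += 1
        if n_events:
            surviving *= 1 - n_events / at_risk
            curve.append((t, surviving))
        at_risk -= n_at_t  # events and censored cases both leave the risk set
    return curve


# Hypothetical data: months, and whether therapy was started (True) or not (False)
months = [10, 20, 20, 35, 50]
treated = [True, True, False, False, True]
for t, s in kaplan_meier(months, treated):
    print(t, round(s, 3))
```

Two such curves, one per expression-defined subset, would then be compared with a log-rank or similar test to obtain P values of the kind quoted above.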

Example 13 Identification of Sequence Alterations in miR Genes Associated with CLL

Introduction

Using tumor DNA from CLL samples, we screened more than 700 kb of tumor DNAs (mean 39 patients/miRNA for mean 500 bp/miRNA) for sequence alterations in each of 35 different miR genes. Very rare polymorphisms or tumor-specific mutations were identified in 4 of the 39 CLL cases, affecting one of three different miR genes: miR-16-1, miR-27b and miR-206. In two other miR genes, miR-34b and miR-100, polymorphisms were identified in both CLL and normal samples with similar frequencies.

Materials and Methods

Detection of microRNA mutations. Thirty-five miR genes were analyzed for the presence of a mutation, including 16 members of the miR expression signature identified in Example 12 (mir-15a, mir-16-1, mir-23a, mir-23b, mir-24-1, mir-24-2, mir-27a, mir-27b, mir-29b-2, mir-29c, mir-146, mir-155, mir-181a, mir-221, mir-222, mir-223) and 19 other miR genes selected randomly (let-7a2, let-7b, mir-21, mir-30a, mir-30b, mir-30c, mir-30d, mir-30e, mir-32, mir-100, mir-108, mir-125b1, mir-142-5p, mir-142-3p, mir-193, mir-181a, mir-206, mir-213 and mir-224).

Screening for miR gene mutations in CLL samples was performed as follows: the genomic region corresponding to each precursor miRNA from either 39 CLL samples or 3 normal mononuclear cell samples from healthy individuals was amplified, including at least 50 base pairs in the 5′ and 3′ extremities. For the miRNAs located in clusters covering less than one kilobase, the entire corresponding genomic region was amplified and sequenced using the Applied Biosystems Model 377 DNA sequencing system (PE, Applied Biosystems, Foster City, Calif.). When a deviation from the normal sequence was found, a panel of blood DNAs from 95 normal individuals was screened to confirm that the deviation represented a polymorphism. If the sequencing data were normal, an additional panel of 37 CLL cases was screened to determine the frequency of mutations in a total of 76 cancer patients. If additional mutations were found, another set of 65 normal DNAs was screened, to assess the frequency of the specific alteration in a total of 160 normal samples.
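The "deviation from the normal sequence" step amounts to comparing each patient read against the reference precursor sequence. A minimal Python sketch is given below, assuming the two sequences are already aligned and of equal length (insertions or deletions would need a real alignment step). The function name is hypothetical; the two fragments are the last 13 bases of the normal and variant mir-16-1 sequences listed in Table 16.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences differ in length; alignment is required")
    return [
        (pos, ref_base, smp_base)
        for pos, (ref_base, smp_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != smp_base
    ]


# 3' ends of the normal and variant mir-16-1 sequences (SEQ ID NOs. 641 and 642)
normal_tail = "GACCATACTCTAC"
variant_tail = "GACCATACTTTAC"
print(find_substitutions(normal_tail, variant_tail))
```

An empty list corresponds to "normal" sequencing data; a non-empty list flags the sample for follow-up in the larger normal and CLL panels.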

In vivo studies of mir-16-1 mutant effects. We constructed two mir-16-1/mir-15a expression vectors (one containing an 832 base pair genomic sequence that included both mir-16-1 and mir-15a, and another nearly identical construct containing the C to T mir-16-1 substitution, as shown in SEQ ID NO. 642) by ligating the relevant open reading frame in a sense orientation into the mammalian expression vector, pSR-GFP-Neo (OligoEngine, Seattle, Wash.). These vectors are referred to as mir-16-1-WT and mir-16-1-MUT, respectively. All sequenced constructs were transfected into 293 cells using Lipofectamine 2000 according to the manufacturer's protocol (Invitrogen, Carlsbad, Calif.). The expression of both mir-16-1-WT and mir-16-1-MUT constructs was assessed by Northern blotting as previously described (G. A. Calin et al., Proc. Natl. Acad. Sci. U.S.A. 99, 15524-15529 (2002)).

Results

Very rare polymorphisms or tumor-specific mutations were identified in 4 of the 39 CLL cases, affecting one of three different miR genes: miR-16-1, miR-27b and miR-206 (Tables 15 and 16). In two other miR genes, miR-34b and miR-100, polymorphisms were identified in both CLL and normal samples with similar frequencies (see Tables 15, 16 and Results section below).

TABLE 15 Genetic variations in the genomic sequences of miR genes in CLL patients.
miRNA | Mutation | CLL (%) | Other allele | Normals | miRNA CHIP | Observation
mir-16-1 | C to T (see SEQ ID NO. 642) | 2/76 (2.6) | Deleted (FISH, LOH) | 0/160 (0) | Reduced expression | Heterozygous in normal cells from both patients; Previous breast cancer; Mother died with CLL; sister died with breast cancer.
mir-27b | G to A (see SEQ ID NO. 646) | 1/39 (2.6) | Normal | 0/98 (0) | Normal expression |
mir-206 | G to A (see SEQ ID NO. 647) | 1/39 (2.6) | Normal | NA | NA |
mir-100 | G to A (see SEQ ID NO. 644) | 17/39 (43.5) | Normal | 2/3 | NA |

TABLE 16 Sequences showing genetic variations in the miR genes of CLL patients.
Name | SEQ ID NO. | Precursor Sequence (5′ to 3′)
hsa-mir-16-1-normal | 641 | GTCAGCAGTGCCTTAGCAGCACGTAAATATTGGCGTTAAGATTCTAAAATTATCTCCAGTATTAACTGTGCTGCTGAAGTAAGGTTGACCATACTCTAC
hsa-mir-16-1-MUT | 642 | GTCAGCAGTGCCTTAGCAGCACGTAAATATTGGCGTTAAGATTCTAAAATTATCTCCAGTATTAACTGTGCTGCTGAAGTAAGGTTGACCATACT T TAC
hsa-mir-100 | 643 | CCTGTTGCCACAAACCCGTAGATCCGAACTTGTGGTATTAGTCCGCACAAGCTTGTATCTATAGGTATGTGTCTGTTAGGCAATCTCACGGACC
hsa-mir-100-MUT | 644 | CCTGTTGCCACAAACCCGTAGATCCGAACTTGTGGTATTAGTCCGCACAAGCTTGTATCTATAGGTATGTGTCTGTTAGGCAATCTCAC A GACC
hsa-mir-27b-normal | 645 | ACCTCTCTAACAAGGTGCAGAGCTTAGCTGATTGGTGAACAGTGATTGGTTTCCGCTTTGTTCACAGTGGCTAAGTTCTGCACCTGAAGAGAAGGTGAGATGGGGACAGTTAAGTTGGAGCCGCTGGGGCAGAGGCCGTTGCTGACGGGC
hsa-mir-27b-MUT | 646 | ACCTCTCTAACAAGGTGCAGAGCTTAGCTGATTGGTGAACAGTGATTGGTTTCCGCTTTGTTCACAGTGGCTAAGTTCTGCACCTGAAGAGAAGGTGAGATGGGGACAGTTAAGTTGGAGCCGCTGGGGCAGAGGCCGTTGCTGAC A GGC
hsa-mir-206 | 230 | TGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGGATTACTTTGCTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG
hsa-mir-206-MUT | 647 | TGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGGATTACTTT ACTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG
hsa-mir-34b-normal | 648 | GTGCTCGGTTTGTAGGCAGTGTCATTAGCTGATTGTACTGTGGTGGTTACAATCACTAACTCCACTGCCATCAAAACAAGGCACAGCATCACCGCCG
hsa-mir-34b-MUT | 650 | GTGCTCGGTTTGTAGGCAGTGTCATTAGCTGATTGTACTGTGGTGGTTACAATCACTAACTCCACTGCCATCAAAACAAGGCACAGCATCACC A CCG
Note: Each mutation/polymorphism is underlined and indicated in bold in the sequences marked “MUT”.

The miR-16-1 gene is located at 13q14.3. In 2 CLL patients out of 76 screened (2.6%), we found a homozygous C to T polymorphism (compare SEQ ID NO: 641 to SEQ ID NO: 642; Table 16), which is located in a 3′ region of the miR-16-1 precursor (FIG. 7C) with strong conservation in all of the primates analyzed (E. Berezikov et al., Cell 120(1):21-4 (2005)), suggesting that this polymorphism has functional implications. By RT-PCR and Northern blotting we have shown that the precursor miRNA includes the 3′ region harboring the base substitution. Both patients have a significant reduction in mir-16-1 expression in comparison with normal CD5+ cells by miRNACHIP and Northern blotting (FIG. 6, FIG. 7D). Further suggesting a pathogenic role, by FISH and LOH, we found a monoallelic deletion at 13q14.3 in the majority of examined cells. This substitution was not found in any of 160 normal control samples (p<0.05 using chi-square analysis). In both patients, the normal cells from buccal mucosa were heterozygous for this abnormality. Therefore, this change is a very rare polymorphism or a germ-line mutation. In support of the latter is the fact that one of the patients has two relatives (mother and sister) who have been diagnosed with CLL and breast cancer, respectively. Therefore, this family fulfills the minimal criteria for “familial” CLL, i.e., two or more cases of B-CLL in first-degree living relatives (N. Ishibe et al., Leuk. Lymphoma 42(1-2):99-108 (2001)).
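The chi-square comparison quoted above (2 of 76 CLL patients vs. 0 of 160 normal controls) can be checked with a short calculation. The Python sketch below computes the Pearson chi-square statistic for a 2x2 table without continuity correction and its one-degree-of-freedom P value; whether the original analysis applied a correction is not stated, so this variant is an assumption, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 d.f.: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: CLL patients, normal controls; columns: carriers, non-carriers
chi2, p = chi_square_2x2(2, 74, 0, 160)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

With counts this small, an exact test would usually be a more cautious choice than the chi-square approximation.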

To identify a possible pathogenic effect for this substitution, we inserted both the wild-type sequence of the mir-15a/mir-16-1 cluster and the mutated sequence into separate expression vectors. We transfected 293 cells, which have a low endogenous expression of this cluster. As a control, 293 cells transfected with an empty vector were tested. The expression levels of both mir-15a and mir-16-1 were significantly reduced in transfectants expressing the mutant construct in comparison to transfectants expressing the wild-type construct (FIG. 7E). The level of expression in transfectants expressing the mutant construct was comparable with the level of endogenous expression in 293 cells (FIG. 7E). Therefore, we conclude that the C to T change in miR-16-1 affects the processing of the pre-miRNA into mature miRNA.

The miR-27b gene is located on chromosome 9. A heterozygous mutation caused by a G to A change in the 3′ region of the miR-27b precursor (compare SEQ ID NO: 645 to SEQ ID NO: 646; Table 16), but within the transcript of the 23b-27b-24-1 cluster, was identified in one out of 39 CLL samples. miRNACHIP analysis indicated that miR-27b expression was reduced in this sample. This change has not been found in any of the 98 normal individuals screened to date.

The miR-34b gene is located at 11q23. Four CLL patients out of 39 carried two associated polymorphisms, a G to A polymorphism, as shown in SEQ ID NO. 650, and a T to G polymorphism located in the 3′ region of the miR-34b precursor. Both polymorphisms were within the transcript of the mir-34b-mir-34c cluster. One patient was found to be homozygous (presenting by FISH heterozygous abnormal chromosome 11q23), while the other three were heterozygous for the polymorphisms. The same frequency of mutation was found in 35 normal individuals tested.

Example 14 Identification of Abnormalities in the Genomic Sequences of miR Genes Associated with CLL

Introduction

Abnormally expressed cancer genes are frequently targets for genetic abnormalities, e.g., mutations that can either activate or inactivate their function. Therefore, we screened 42 microRNAs for germline or somatic mutations.

Materials and Methods

Detection of MicroRNA Gene Mutations.

The genomic region corresponding to each precursor miRNA, including at least 50 additional base pairs (bp) in the 5′ and 3′ extremities (i.e., flanking sequences), was amplified from 40 CLL samples and normal mononuclear cell samples from 3 healthy individuals. For the miRNAs located in clusters that were less than one kilobase (kb) in length, the entire corresponding genomic region was amplified and sequenced using the Applied Biosystems Model 377 DNA sequencing system (PE, Applied Biosystems, Foster City, Calif.). When a deviation from the normal sequence was found, a panel of blood DNAs from 160 normal individuals, as well as an additional panel of 35 CLL cases (total of 75 leukemia patients), were screened to confirm polymorphisms. All subjects were Caucasian, as indicated by medical records of CLL patients and information obtained during an interview for control patients. For 46 CLL patients, personal and/or familial cancer history was known. Forty-two miR genes were screened for germline or somatic mutations, including 15 members of the specific signature identified in Example 12, or members of the same cluster: miR-15a, miR-16-1, miR-23a, miR-23b, miR-24-1, miR-24-2, miR-27a, miR-27b, miR-29b-2, miR-29c, miR-146, miR-155, miR-221, miR-222, miR-223, as well as 27 other microRNAs that were selected randomly: let-7a2, let-7b, miR-17-3p, miR-17-5p, miR-18, miR-19a, miR-19b-1, miR-20, miR-21, miR-30b, miR-30c-1, miR-30d, miR-30e, miR-32, miR-100, miR-105-1, miR-108, miR-122, miR-125b-1, miR-142-5p, miR-142-3p, miR-193, miR-181a, miR-187, miR-206, miR-224, miR-346.

Results

Germline or somatic mutations were identified in miRNA genomic regions in 11 out of 75 (15%) CLL samples. Five different miRNAs were affected by mutations (5/42 miR genes analyzed, 12%): miR-16-1, miR-27b, miR-206, miR-29b-2 and miR-187. None of these mutations were found in a set of 160 individuals without cancer (p<0.0001) (see Table 17). The positions of the various mutations are shown relative to the position of the miR gene in FIG. 7A. All the abnormalities are localized in regions that are transcribed, as shown by RT-PCR (FIG. 7B). Eight of the 11 (73%) patients with abnormal miRNA sequences have a known personal or familial history of CLL or other hematopoietic or solid tumors (Table 17). Sequences containing the identified miR gene mutations, as well as their corresponding wild-type sequences, are shown in Table 16 for miR-16-1 and miR-27b and in Table 18 for miR-29b-2, miR-187 and miR-206. Two mutations were identified in miR-29b-2 and miR-206 (labeled MUT1 and MUT2, respectively, in Table 18). In addition, a polymorphism was detected in both CLL and normal samples with similar frequencies for three other miR genes: miR-29c, miR-122a and miR-187 (labeled MUT2) (see Tables 17 and 18).
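The comparison of 11 mutation carriers among 75 CLL patients with 0 among 160 controls can be illustrated with an exact test. The text does not name the test behind p<0.0001, so the use of a two-sided Fisher's exact test in this Python sketch is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Probability that x of the col1 carriers fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)  # tolerance for float ties
    )


# Rows: CLL patients, controls; columns: mutation carriers, non-carriers
p_value = fisher_exact_two_sided(11, 64, 0, 160)
print(f"p = {p_value:.2e}")
```

Under these assumptions the P value falls well below the 0.0001 threshold reported above.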

TABLE 17 Genetic variations in the genomic sequences of miR genes in CLL patients.
miRNA | Location** | CLL | Normals | miRNACHIP expression | Observations
miR-16-1 | Germline; pri-miRNA: C to T substitution at +7 bp in the 3′ flanking | 2/75 | 0/160 | Reduced to 15% and 40% of normal, respectively | Normal allele deleted in CLL cells in both patients (FISH, LOH). For one patient: History of previous breast cancer; mother with CLL (deceased); sister with breast cancer (deceased).
miR-27b | Germline; pri-miRNA: G to A substitution at +50 bp in 3′ flanking sequence | 1/75 | 0/160 | Normal | Mother with throat and lung cancer at age 58. Father with lung cancer at age 57.
miR-29b-2 | pri-miRNA: G to A substitution at +212 in 3′ flanking sequence | 1/75 | 0/160 | Reduced to 75% | Sister with breast cancer at age 88 (still living). Brother with “some type of blood cancer” at age 70.
miR-29b-2 | pri-miRNA: A insertion at +107 in 3′ flanking sequence | 3/75 | 0/160 | Reduced to 80% | Both patients have a family history of unspecified cancer.
miR-187 | pri-miRNA: T to C substitution at +73 in 3′ flanking sequence | 1/75 | 0/160 | NA | Unknown
miR-206 | pre-miRNA: G to T substitution at position 49 of precursor | 2/75 | 0/160 | Reduced to 25% | Prostate cancer; mother with esophageal cancer. Brother with prostate cancer; sister with breast cancer.
miR-206 | Somatic; pri-miRNA: A to T substitution at −116 in 5′ flanking sequence | 1/75 | 0/160 | Reduced to 25% (data only for one pt) | Aunt with leukemia (deceased)
miR-29c | pri-miRNA: G to A substitution at −31 in 5′ flanking sequence | 2/75 | 1/160 | NA | Paternal grandmother with CLL; sister with breast cancer.
miR-122a | pre-miRNA: C to T substitution at position 53 of precursor | 1/75 | 2/160 | Reduced to 33% | Paternal uncle with colon cancer.
miR-187 | pre-miRNA: G to A substitution at position 34 of precursor | 1/75 | 1/160 | NA | Grandfather with polycythemia vera. Father has a history of cancer but not lymphoma.
For each CLL patient/normal control, more than 12 kb of genomic DNA was sequenced. In total, ~627 kb of tumor DNA and about 700 kb of normal DNA was screened by direct sequencing. The positions of the mutations are reported with respect to the precursor miRNA molecule. **When normal corresponding DNA from buccal mucosa was available, the alteration was identified as germline when present or somatic when absent, respectively. FISH = fluorescence in situ hybridization; LOH = loss of heterozygosity; NA = not available.

TABLE 18 Sequences showing genetic variations in the miR genes of CLL patients.
Name | SEQ ID NO. | Precursor Sequence (5′ to 3′) +/− 5′ or 3′ flanking genomic sequence
hsa-mir-29b-2-normal | 651 | CTTCTGGAAGCTGGTTTCACATGGTGGCTTAGATTTTTCCATCTTTGTATCTAGCACCATTTGAAATCAGTGTTTTAGGAGTAAGAATTGCAGCACAGCCAAGGGTGGACTGCAGAGGAACTGCTGCTCATGGAACTGGCTCCTCTCCTCTTGCCACTTGAGTCTGTTCGAGAAGTCCAGGGAAGAACTTGAAGAGCAAAATACACTCTTGAGTTTGTTGGGTTTTGGGAGAGGTGACAGTAGAGAAGGGGGTTGTGTTTAAAATAAACACAGTGGCTTGAGCAGGGGCAGAGG
hsa-mir-29b-2-MUT1 (G to A substitution at +212 in 3′ flanking sequence) | 652 | CTTCTGGAAGCTGGTTTCACATGGTGGCTTAGATTTTTCCATCTTTGTATCTAGCACCATTTGAAATCAGTGTTTTAGGAGTAAGAATTGCAGCACAGCCAAGGGTGGACTGCAGAGGAACTGCTGCTCATGGAACTGGCTCCTCTCCTCTTGCCACTTGAGTCTGTTCGAGAAGTCCAGGGAAGAACTTGAAGAGCAAAATACACTCTTGAGTTTGTTGGGTTTTGGGAGAGGTGACAGTAGAGAAGGGGGTTGTGTTTAAAATAAACACAGTGGCTTGAGCAGGGGCAGA AG
hsa-mir-29b-2-MUT2 (A insertion at +107 in 3′ flanking sequence) | 653 | CTTCTGGAAGCTGGTTTCACATGGTGGCTTAGATTTTTCCATCTTTGTATCTAGCACCATTTGAAATCAGTGTTTTAGGAGTAAGAATTGCAGCACAGCCAAGGGTGGACTGCAGAGGAACTGCTGCTCATGGAACTGGCTCCTCTCCTCTTGCCACTTGAGTCTGTTCGAGAAGTCCAGGGAAGA A ACTTGAAGAGCAAAATACACTCTTGAGTTTGTTGGGTTTTGGGAGAGGTGACAGTAGAGAAGGGGGTTGTGTTTAAAATAAACACAGTGGCTTGAGCAGGGGCAGAGG
hsa-mir-187-normal | 654 | GGTCGGGCTCACCATGACACAGTGTGAGACCTCGGGCTACAACACAGGACCCGGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCAGCAGAGCCTGCTCCGCTTGTCCTGAGGGACTCGACACAGGGGACTGCACAGAGACCATGGGAAAGTCCAGGCTC
hsa-mir-187-MUT1 (T to C substitution at +73 in 3′ flanking sequence) | 655 | GGTCGGGCTCACCATGACACAGTGTGAGACCTCGGGCTACAACACAGGACCCGGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCAGCAGAGCCTGCTCCGCTTGTCCTGAGGGACTCGACACAGGGGACTGCACAGAGACCATGGGAAAGTCCAGGC C C
hsa-mir-187-MUT2 (G to A substitution at position 34 of precursor) | 656 | GGTCGGGCTCACCATGACACAGTGTGAGACTCG A GCTACAACACAGGACCCGGGGCGCTGCTCTGACCCCTCGTGTCTTGTGTTGCAGCCGGAGGGACGCAGGTCCGCAGCAGAGCCTGCTCCGCTTGTCCTGAGGGACTCGACACAGGGGACTGCACAGAGACCATGGGAAAGTCCAGGCTC
hsa-mir-206 | 657 | GATTTAGGATGAGTTGAGATCCCAGTGATCTTCTCGCTAAGAGTTTCCTGCCTGGGCAAGGAGGAAAGATGCTACAAGTGGCCCACTTCTGAGATGCGGGCTGCTTCTGGATGACACTGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGGATTACTTTGCTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG
hsa-mir-206-MUT1 (G to T substitution at position 49 of precursor) | 658 | GATTTAGGATGAGTTGAGATCCCAGTGATCTTCTCGCTAAGAGTTTCCTGCCTGGGCAAGGAGGAAAGATGCTACAAGTGGCCCACTTCTGAGATGCGGGCTGCTTCTGGATGACACTGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGGATTACTTT TCTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG
hsa-mir-206-MUT2 (A to T substitution at −116 in 5′ flanking sequence) | 659 | G TTTTAGGATGAGTTGAGATCCCAGTGATCTTCTCGCTAAGAGTTTCCTGCCTGGGCAAGGAGGAAAGATGCTACAAGTGGCCCACTTCTGAGATGCGGGCTGCTTCTGGATGACACTGCTTCCCGAGGCCACATGCTTCTTTATATCCCCATATGGATTACTTTTCTATGGAATGTAAGGAAGTGTGTGGTTTCGGCAAGTG
hsa-mir-29c-normal | 660 | CGAGGTGCAGACCCTGGGAGCACCACTGGCCCATCTCTTACACAGGCTGACCGATTTCTCCTGGTGTTCAGAGTCTGTTTTTGTCTAGCACCATTTGAAATCGGTTATGATGTAGGGGGA
hsa-mir-29c-MUT (G to A substitution at −31 in 5′ flanking sequence) | 661 | C AAGGTGCAGACCCTGGGAGCACCACTGGCCCATCTCTTACACAGGCTGACCGATTTCTCCTGGTGTTCAGAGTCTGTTTTTGTCTAGCACCATTTGAAATCGGTTATGATGTAGGGGGA
hsa-mir-122a-normal | 662 | CCTTAGCAGAGCTGTGGAGTGTGACAATGGTGTTTGTGTCTAAACTATCAAACGCCATTATCACACTAAATAGCTACTGCTAGGC
hsa-mir-122a-MUT (C to T substitution at position 53 of precursor) | 663 | CCTTAGCAGAGCTGTGGAGTGTGACAATGGTGTTTGTGTCTAAACTATCAAA TGCCATTATCACACTAAATAGCTACTGCTAGGC
Note: The position of each mutation/polymorphism is underlined and indicated in bold in the sequences marked “MUT”.

Example 15 A Unique MicroRNA Signature Associated with Prognostic Factors and Disease Progression in Chronic Lymphocytic Leukemia

Introduction: In spite of extensive effort, little is known regarding the pathogenic events leading to the initiation and progression of B cell CLL, the most frequent adult leukemia in the Western world. In contrast, several factors predicting the clinical course have been defined. CLL cells with few or no mutations in the immunoglobulin heavy-chain variable-region gene (IgV_(H)) or with high expression of the 70-kD zeta-associated protein (ZAP-70+) have an aggressive course, whereas patients with mutated clones or few ZAP-70+ B cells have an indolent course (Chiorazzi, N., et al., N. Engl. J. Med. 352:804-815 (2005)). It was also found that genomic aberrations in CLL are important independent predictors of disease progression and survival (Dohner, H., et al., N. Engl. J. Med. 343(26):1910-1916 (2000)). However, the molecular basis of these associations is largely unknown. Here, we performed genome-wide expression profiling with the miRNACHIP in a large series of CLL samples with extensive clinical data to examine whether expression of these noncoding genes is associated with factors predicting the clinical course.

Materials and Methods

Patient samples and clinical database. Samples used for this study are described in detail in Example 12.

RNA extraction, Northern blots and miRNACHIP experiments. Procedures were performed as described (Calin, G. A., et al., Proc. Natl. Acad. Sci. USA 101(32):11755-11760 (2004); Liu, C.-G., et al., Proc. Natl. Acad. Sci. USA 101(26):9740-9744 (2004)). Briefly, labeled targets from 5 μg of total RNA were used for hybridization on each miRNACHIP microarray chip containing 368 probes in triplicate, corresponding to 245 human and mouse miRNA genes. Of note, for 76 microRNAs on the miRNACHIP, two specific oligos were synthesized, one identifying the active 22 nt part of the molecule and the other identifying the 60-110 nt precursor (Liu, C.-G., et al., Proc. Natl. Acad. Sci. USA 101(26):9740-9744 (2004)).

Data analysis. Raw data were normalized and analyzed in GeneSpring® software version 7.2 (Silicon Genetics, Redwood City, Calif.). Expression data were median centered using either the GeneSpring normalization option or the Global Median normalization of the Bioconductor package, without any substantial difference. Statistical comparisons were done using both the GeneSpring ANOVA tool and the SAM software (Significance Analysis of Microarrays). miRNA predictors were calculated by using PAM software (Prediction Analysis of Microarrays); the Support Vector Machine tool of GeneSpring was used for the cross-validation and test-set prediction. The Kaplan-Meier plot (“survival analysis” of the PAM software) was used to identify an association between miRNA expression and the time elapsing between CLL diagnosis and the beginning of therapy. miRNAs able to best separate the two groups were identified at the same time. All data were submitted using MIAMExpress to the Array Express database (accession numbers to be received upon revision). We validated the microarray data for 4 miRNAs (miR-16-1, miR-26a, miR-206 and miR-223) in 11 CLL samples and normal CD5 cells by solution hybridization detection as presented elsewhere (Calin, G. A., et al., Proc. Natl. Acad. Sci. USA 101(32):11755-11760 (2004)). Furthermore, miR-15a and miR-16-1 expression in the patients with germline mutation was confirmed by Northern blot.
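Median centering can be sketched as subtracting each array's median from its (log-scale) intensities so that every chip ends up with a median of zero. The Python sketch below illustrates that general idea only; it does not reproduce the GeneSpring or Bioconductor implementations, and the function name and intensity values are hypothetical.

```python
from statistics import median


def median_center(log_intensities):
    """Shift one array's log-scale intensities so that their median is zero."""
    center = median(log_intensities)
    return [value - center for value in log_intensities]


# Hypothetical log2 intensities for five probes on two chips
chip_a = [7.0, 9.5, 8.0, 12.0, 8.5]
chip_b = [8.0, 10.5, 9.0, 13.0, 9.5]  # same profile, uniformly brighter

centered_a = median_center(chip_a)
centered_b = median_center(chip_b)
print(centered_a)
print(centered_b)
```

After centering, the two hypothetical chips give identical profiles, which is the kind of chip-to-chip brightness difference this normalization is meant to remove.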

Analysis of ZAP-70 and Sequence analysis of expressed IgV_(H). These experiments were performed as described in Example 12.

Results

Comparison of miRNA expression in Zap-70 positive/IgV_(H) Umut vs. Zap-70 negative/IgV_(H) Mut CLL cells. In Example 12, a unique signature that can discriminate between the two main groups of CLL patients (i.e., Zap-70 positive/IgV_(H) Umut and Zap-70 negative/IgV_(H) Mut), composed of 17 genes, was identified using the PAM package. Using additional algorithms for statistical and prediction analysis (i.e., SAM and GeneSpring) to validate the PAM signature, we found that a signature composed of 13 mature microRNAs could discriminate (at P<0.01) between Zap-70 positive/IgV_(H) Umut and Zap-70 negative/IgV_(H) Mut patients (Tables 19 and 20). Furthermore, the prediction made using Support Vector Machine correctly classified all patients (Table 20). The majority of miRNAs (9 out of 13) were significantly overexpressed in the group with poor prognosis. The 10 patients belonging to the Zap-70 positive/IgV_(H) Mut group were equally assigned to the good- and poor-prognosis groups, suggesting either that there are no microRNAs on the miRNACHIP whose expression can distinguish these two groups, that these two groups are not different with regard to microRNA expression profiles, or that the groups are too small to be correctly classified.

We also used the Support Vector Machine algorithm to predict an additional independent set of 50 CLL samples with known ZAP-70 status (Table 21). When the 13 miRNAs of the identified signature were used, the prediction was made correctly in all cases, thereby confirming our results. Also confirming the microarray specificity, as reported in Liu, C.-G., et al., Proc. Natl. Acad. Sci. USA 101(26):9740-9744 (2004), the signature did not include very similar members of the same families, such as miR-23a (1 base difference from miR-23b) and miR-15b (four bases difference from miR-15a), while the identical mature miRNAs miR-16-1 and miR-16-2 were both identified, indicating that the chip is able to discriminate between highly similar isoforms.

TABLE 19 miRNA signature associated with prognostic factors (ZAP70 and IgVH mutations) and disease progression in CLL patients*.

Nr. Crt. | Component | Map | P value | Group 4 expression** | Putative targets*** | Observation****
1 | miR-15a | 13q14.3 | 0.018 | high | NA | cluster 15a/16-1; del CLL & Prostate ca.
2 | miR-195 | 17p13 | 0.017 | high | NA | del HCC
3 | miR-221 | Xp11.3 | 0.010 | high | HECTD2, CDKN1B, NOVA1, ZFPM2, PHF2 | cluster 221/222
4 | miR-23b | 9q22.1 | 0.009 | high | FNBP1L, WTAP, PDE4B, SATB1, SEMA6D | cluster 24-1/23b; FRA 9D; del Urothelial ca.
5 | miR-155 | 21q21 | 0.009 | high | ZNF537, PICALM, RREB1, BDNF, QKI | amp child Burkitt's lymphoma
6 | miR-223 | Xq12-13.3 | 0.007 | low | PTBP2, SYNCRIP, WTAP, FBXW7, QKI | normally expression restricted to myeloid lineage
7 | miR-29a-2 | 7q32 | 0.004 | low | NA | cluster 29a-2/29b-1; FRA7H; del Prostate ca.
8 | miR-24-1 | 9q22.1 | 0.003 | high | TOP1, FLJ45187, RSBN1L, RAP2C, PRPF4B | cluster 24-1/23b; FRA 9D; del Urothelial ca.
9 | miR-29b-2 (miR-102) | 1q32.2-32.3 | 0.0007 | low | NA |
10 | miR-146 | 5q34 | 0.0007 | high | NOVA1, NFE2L1, C1orf16, ABL2, ZFYVE1 |
11 | miR-16-1 | 13q14.3 | 0.0004 | high | BCL2, CNOT6L, USP15, PAFAH1B1, ESRRG | cluster 15a/16-1; del CLL, prostate ca.
12 | miR-16-2 | 3q26.1 | 0.0003 | high | see miR-16-1 | identical miR-16-1
13 | miR-29c | 1q32.2-32.3 | 0.0002 | low | NA |

Note: *All the members of the signature are mature miRNAs; **Group 4 includes patients with IgVh mutated and Zap-70 negative, both predictors of poor prognosis. ***Top five predictions using TargetScan (Lewis, B. P., et al., Cell 120: 15-20 (2005)) were included. NA = not available; for specific gene names, see the NCBI site. ****FRA = fragile site; del = deletion; HCC = hepatocellular carcinoma; ca. = carcinoma.
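As a quick consistency check, the expression directions listed in Table 19 can be tallied against the statement that 9 of the 13 signature miRNAs are overexpressed in the poor-prognosis group. The following Python sketch simply counts the "high" and "low" entries transcribed from the table; the variable names are illustrative.

```python
from collections import Counter

# Group 4 expression column of Table 19, keyed by component.
SIGNATURE = {
    "miR-15a": "high", "miR-195": "high", "miR-221": "high",
    "miR-23b": "high", "miR-155": "high", "miR-223": "low",
    "miR-29a-2": "low", "miR-24-1": "high", "miR-29b-2": "low",
    "miR-146": "high", "miR-16-1": "high", "miR-16-2": "high",
    "miR-29c": "low",
}

# Count how many components fall in each expression direction.
counts = Counter(SIGNATURE.values())

if __name__ == "__main__":
    print(len(SIGNATURE), counts["high"], counts["low"])
```

The four components with low expression are miR-223 and the three miR-29 family members.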

TABLE 20 List of miRNAs associated with prognostic factors and disease progression in CLL patients selected by Prediction Analysis of Microarrays (PAM) and ANOVA analysis (GeneSpring)*.

Nr. crt. | PAM signature | n− score | y+ score | map | GeneSpring signature | Anova p-value | map
1 | mir-222 | −0.022 | 0.0288 | Xp11.2 | mir-34-prec | 0.048 | 1p36.22
2 | mir-24-2 | −0.0272 | 0.0355 | 19p13.12 | mir-192-2/3-prec | 0.0457 | 11q13
3 | mir-181a | −0.0279 | 0.0364 | 1q32.1 | mir-15a-prec | 0.0353 | 13q14.3
4 | mir-15a | −0.0372 | 0.0485 | 13q14.3 | mir-17 | 0.0257 | 13q31
5 | mir-24-1 | −0.0427 | 0.0558 | 9q22.1 | mir-15a | 0.018 | 13q14.3
6 | mir-195 | −0.053 | 0.0692 | 17p13 | mir-195 | 0.0175 | 17p13
7 | mir-23a | −0.0748 | 0.0977 | 19p13.12 | mir-213-prec | 0.0153 | 1q31.3-q32.1
8 | mir-23b | −0.0909 | 0.1187 | 9q22.1 | mir-221 | 0.0105 | Xp11.3
9 | mir-223 | 0.1056 | −0.1379 | Xq12-13.3 | mir-023b | 0.00964 | 9q22.1
10 | mir-29a-2 | 0.1139 | −0.1487 | 7q32 | mir-155 | 0.00959 | 21q21
11 | mir-155 | −0.1155 | 0.1508 | 21q21 | mir-223 | 0.00774 | Xq12-13.3
12 | mir-221 | −0.1157 | 0.1511 | Xp11.3 | mir-132 | 0.00461 | 17p13.3
13 | mir-16-1 | −0.1444 | 0.1886 | 13q14.3 | mir-029a-2 | 0.00446 | 7q32
14 | mir-16-2 | −0.1619 | 0.2113 | 3q26.1 | mir-024-1 | 0.00311 | 9q22.1
15 | mir-146 | −0.1803 | 0.2354 | 5q34 | mir-29b-2 (102) | 0.000778 | 1q32.2-32.3
16 | mir-29b-2 (102) | 0.2065 | −0.2696 | 1q32.2-32.3 | mir-146 | 0.000753 | 5q34
17 | mir-029c | 0.2174 | −0.2839 | 1q32.2-32.3 | mir-016-1 | 0.00042 | 13q14.3
18 | | | | | mir-016-2 | 0.000327 | 3q26.1
19 | | | | | mir-029c | 0.000216 | 1q32.2-32.3

*The list of genes is in ascending order of significance, as represented by score or p value, respectively.
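One way to see how the PAM and GeneSpring lists of Table 20 relate to the 13-member signature is to intersect them after normalizing the names. The Python sketch below does this; the normalization rules (dropping the "(102)" alias and leading zeros such as "mir-023b") are assumptions made for the illustration, and precursor entries ending in "-prec" are deliberately kept distinct from the mature names.

```python
import re

# Signature columns transcribed from Table 20.
PAM = [
    "mir-222", "mir-24-2", "mir-181a", "mir-15a", "mir-24-1", "mir-195",
    "mir-23a", "mir-23b", "mir-223", "mir-29a-2", "mir-155", "mir-221",
    "mir-16-1", "mir-16-2", "mir-146", "mir-29b-2 (102)", "mir-029c",
]
GENESPRING = [
    "mir-34-prec", "mir-192-2/3-prec", "mir-15a-prec", "mir-17", "mir-15a",
    "mir-195", "mir-213-prec", "mir-221", "mir-023b", "mir-155", "mir-223",
    "mir-132", "mir-029a-2", "mir-024-1", "mir-29b-2 (102)", "mir-146",
    "mir-016-1", "mir-016-2", "mir-029c",
]


def normalize(name):
    """Drop a trailing alias such as '(102)' and leading zeros in the number."""
    base = name.split(" ")[0]
    return re.sub(r"^mir-0+", "mir-", base)


pam_set = {normalize(n) for n in PAM}
genespring_set = {normalize(n) for n in GENESPRING}

shared = sorted(pam_set & genespring_set)    # selected by both methods
pam_only = sorted(pam_set - genespring_set)  # selected by PAM alone

if __name__ == "__main__":
    print(len(shared), shared)
    print(pam_only)
```

With these assumed normalization rules, the entries common to both lists number 13, and the entries found only in the PAM list are mir-181a, mir-222, mir-23a and mir-24-2.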

TABLE 21 Predictions of ZAP-70 status and Immunoglobulin heavy chain variable gene status according to miRNA expression in CLL patients*.

CLL | True Value | Prediction | n margin | y margin

PANEL 1 - 83 correct predictions, 0 incorrect predictions
CLL01 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.278 | −1.327
CLL02 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL03 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.247 | −1.348
CLL04 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.16 | −1.388
CLL05 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL06 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL07 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1.122
CLL08 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.391 | −1.595
CLL09 | Zap70 < 20 VhM | Zap70 < 20 VhM | 0.953 | −1.048
CLL10 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.059 | −1.333
CLL11 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL12 | Zap70 < 20 VhM | Zap70 < 20 VhM | 0.997 | −1.261
CLL13 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.488 | −1.841
CLL14 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.171 | −2.582
CLL15 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.252 | −1.352
CLL16 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1.188
CLL17 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.19 | −1.284
CLL18 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.747 | −2.062
CLL19 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.503 | −1.833
CLL20 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL21 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL22 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL23 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.047 | −2.27
CLL24 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.464 | −1.527
CLL25 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL26 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.034 | −1.034
CLL27 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.479 | −1.617
CLL28 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.355 | −2.57
CLL29 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL30 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL31 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL32 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL33 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.229 | −2.496
CLL34 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.683 | −2.931
CLL35 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL36 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.578 | −2.768
CLL37 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.079 | −2.34
CLL38 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.745 | −1.814
CLL39 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.559 | −1.699
CLL40 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.608 | −3.005
CLL41 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.357 | −2.676
CLL42 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1.102 | −1.303
CLL43 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL44 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.464 | −2.629
CLL45 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL46 | Zap70 < 20 VhM | Zap70 < 20 VhM | 1 | −1
CLL47 | Zap70 < 20 VhM | Zap70 < 20 VhM | 2.074 | −2.271
CLL48 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL49 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.179 | 1.487
CLL50 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 0.88
CLL51 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL52 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.836 | 2.405
CLL53 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL54 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.334 | 1.649
CLL55 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1.229
CLL56 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL57 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL58 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.171 | 1.566
CLL59 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL60 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL61 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.505 | 1.976
CLL62 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.095 | 1.46
CLL63 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.297 | 2.717
CLL64 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.187 | 1.381
CLL65 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL66 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.344 | 1.479
CLL67 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.876 | 2.049
CLL68 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL69 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.89 | 1.987
CLL70 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.658 | 2.938
CLL71 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.556 | 1.967
CLL72 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.574 | 2.81
CLL73 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL74 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL75 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL76 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.671 | 3.041
CLL77 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1.376
CLL78 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL79 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.678 | 1.914
CLL80 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.416 | 2.953
CLL81 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1 | 1
CLL82 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −1.782 | 1.846
CLL83 | Zap70 > 20 VhUM | Zap70 > 20 VhUM | −2.307 | 2.716

PANEL 2 - 50 correct predictions, 0 incorrect predictions
CLL95 | Zap70 < 20 | Zap70 < 20 | 8.494 | −8.494
CLL96 | Zap70 < 20 | Zap70 < 20 | 1 | −1
CLL97 | Zap70 < 20 | Zap70 < 20 | 0.763 | −0.763
CLL98 | Zap70 < 20 | Zap70 < 20 | 11.19 | −11.19
CLL99 | Zap70 < 20 | Zap70 < 20 | 7.561 | −7.561
CLL100 | Zap70 < 20 | Zap70 < 20 | 14.51 | −14.51
CLL101 | Zap70 < 20 | Zap70 < 20 | 5.585 | −5.585
CLL102 | Zap70 < 20 | Zap70 < 20 | 1 | −1
CLL103 | Zap70 < 20 | Zap70 < 20 | 10.09 | −10.09
CLL104 | Zap70 < 20 | Zap70 < 20 | 5.521 | −5.521
CLL105 | Zap70 < 20 | Zap70 < 20 | 7.33 | −7.33
CLL106 | Zap70 < 20 | Zap70 < 20 | 3.264 | −3.264
CLL107 | Zap70 < 20 | Zap70 < 20 | 7.774 | −7.774
CLL108 | Zap70 < 20 | Zap70 < 20 | 5.3 | −5.3
CLL109 | Zap70 < 20 | Zap70 < 20 | 4.34 | −4.34
CLL110 | Zap70 < 20 | Zap70 < 20 | 1.822 | −1.822
CLL111 | Zap70 < 20 | Zap70 < 20 | 3.879 | −3.879
CLL112 | Zap70 < 20 | Zap70 < 20 | 8.514 | −8.514
CLL113 | Zap70 < 20 | Zap70 < 20 | 5.866 | −5.866
CLL114 | Zap70 < 20 | Zap70 < 20 | 10.69 | −10.69
CLL115 | Zap70 < 20 | Zap70 < 20 | 4.141 | −4.141
CLL116 | Zap70 < 20 | Zap70 < 20 | 1 | −1
CLL117 | Zap70 < 20 | Zap70 < 20 | 1 | −1
CLL118 | Zap70 < 20 | Zap70 < 20 | 1 | −1
CLL119 | Zap70 < 20 | Zap70 < 20 | 10.11 | −10.11
CLL120 | Zap70 > 20 | Zap70 > 20 | −3.109 | 3.109
CLL121 | Zap70 > 20 | Zap70 > 20 | −4.722 | 4.722
CLL122 | Zap70 > 20 | Zap70 > 20 | −5.166 | 5.166
CLL123 | Zap70 > 20 | Zap70 > 20 | −7.828 | 7.828
CLL124 | Zap70 > 20 | Zap70 > 20 | −7.468 | 7.468
CLL125 | Zap70 > 20 | Zap70 > 20 | −11.44 | 11.44
CLL126 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL127 | Zap70 > 20 | Zap70 > 20 | −6.617 | 6.617
CLL128 | Zap70 > 20 | Zap70 > 20 | −7.011 | 7.011
CLL129 | Zap70 > 20 | Zap70 > 20 | −7.479 | 7.479
CLL130 | Zap70 > 20 | Zap70 > 20 | −9.568 | 9.568
CLL131 | Zap70 > 20 | Zap70 > 20 | −5.286 | 5.286
CLL132 | Zap70 > 20 | Zap70 > 20 | −5.045 | 5.045
CLL133 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL134 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL135 | Zap70 > 20 | Zap70 > 20 | −1.324 | 1.324
CLL136 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL137 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL138 | Zap70 > 20 | Zap70 > 20 | −9.649 | 9.649
CLL139 | Zap70 > 20 | Zap70 > 20 | −9.264 | 9.264
CLL140 | Zap70 > 20 | Zap70 > 20 | −7.13 | 7.13
CLL141 | Zap70 > 20 | Zap70 > 20 | −11.77 | 11.77
CLL142 | Zap70 > 20 | Zap70 > 20 | −2.986 | 2.986
CLL143 | Zap70 > 20 | Zap70 > 20 | −1 | 1
CLL144 | Zap70 > 20 | Zap70 > 20 | −1 | 1

*Prediction for 83 CLLs, from groups 1 and 4 (see text). Classification was generated by the 'Support Vector Machines' algorithm (Kernel Function used: Polynomial Dot Product (Order 2). Diagonal Scaling Factor: 0). The miRNA signature associated with prognostic factors was generated using panel 1 samples and then tested to cross validate the panel 1 and to predict the status of samples from panel 2.

Association Between miRNA Expression and Time to Initial Therapy.

This analysis was performed as described in Example 12. All of the microRNAs which can predict the time to initial therapy, with the exception of mir-29c, are overexpressed in the group characterized by a short interval from diagnosis to initial therapy (Table 22). The PAM score for each of the components of microRNA signature associated with the time from diagnosis to initial therapy is presented in Table 23.

TABLE 22 Relative expression levels of microRNAs predictive of the time interval from diagnosis to initial therapy.

microRNA | Microarray expression, short interval | Microarray expression, long interval
hsa-mir-181a | High | Low
hsa-mir-155 | High | Low
hsa-mir-146 | High | Low
hsa-mir-024-2 | High | Low
hsa-mir-023b | High | Low
hsa-mir-023a | High | Low
hsa-mir-222 | High | Low
hsa-mir-221 | High | Low
hsa-mir-029c | Low | High

TABLE 23 PAM score for each of the components of microRNA signature associated with the time from diagnosis to initial therapy.

microRNA | 1 score | 2 score
hsa-mir-181a | 0.1862 | −0.0603
hsa-mir-155 | 0.1409 | −0.0456
hsa-mir-146 | 0.07 | −0.0227
hsa-mir-024-2 | 0.0696 | −0.0225
hsa-mir-023b | 0.0643 | −0.0208
hsa-mir-023a | 0.0587 | −0.019
hsa-mir-222 | 0.0458 | −0.0148
hsa-mir-221 | 0.0343 | −0.0111
hsa-mir-029c | −0.0221 | 0.0072

Note: Score 1 characterizes the short time and score 2 the long time from diagnosis to initial therapy in a panel of 94 CLL patients.
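The relationship between the PAM scores of Table 23 and the High/Low calls of Table 22 can be sketched in Python as follows. The sketch assumes that a positive class score means expression above the overall average for that class and a negative score means expression below it; this reading is an assumption for illustration and is not stated in the specification.

```python
# (score 1, score 2) per microRNA, transcribed from Table 23.
PAM_SCORES = {
    "hsa-mir-181a": (0.1862, -0.0603),
    "hsa-mir-155": (0.1409, -0.0456),
    "hsa-mir-146": (0.07, -0.0227),
    "hsa-mir-024-2": (0.0696, -0.0225),
    "hsa-mir-023b": (0.0643, -0.0208),
    "hsa-mir-023a": (0.0587, -0.019),
    "hsa-mir-222": (0.0458, -0.0148),
    "hsa-mir-221": (0.0343, -0.0111),
    "hsa-mir-029c": (-0.0221, 0.0072),
}


def direction(score_short, score_long):
    """Assumed reading: a positive class score maps to 'High', else 'Low'."""
    short = "High" if score_short > 0 else "Low"
    long_ = "High" if score_long > 0 else "Low"
    return short, long_


# Derived (short interval, long interval) calls for each microRNA.
calls = {name: direction(*scores) for name, scores in PAM_SCORES.items()}

if __name__ == "__main__":
    for name, (short, long_) in calls.items():
        print(name, short, long_)
```

Under this assumption the derived calls match Table 22: eight microRNAs are High in the short-interval group and only hsa-mir-029c is Low.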

The relevant teachings of all publications cited herein that have not explicitly been incorporated by reference, are incorporated herein by reference in their entirety. One skilled in the art will readily appreciate that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof and, accordingly, reference should be made to the appended claims, rather than to the foregoing specification, as indicating the scope of the invention.

1. A method of determining increased risk of a human subject developing B-cell chronic lymphocytic leukemia (CLL) or increased likelihood of the presence of CLL in a human subject comprising the steps of: (i) measuring in a blood sample from the subject the level of one or more microRNAs selected from the group consisting of miR-029a microRNA, miR-029b-2 microRNA and miR-029c microRNA; and (ii) comparing the level of the one or more microRNAs in the sample to a control level of the one or more microRNAs, wherein an increase in the level of the one or more microRNAs in the sample relative to the control level of the one or more microRNAs is indicative of the subject either having increased risk of developing CLL or increased likelihood of the presence of CLL.
2. The method of claim 1, wherein the level of the one or more microRNAs in the sample is measured using an assay selected from the group consisting of northern blot analysis, in situ hybridization and quantitative reverse transcriptase polymerase chain reaction.
3. The method of claim 1, wherein the subject has an unmutated IgV_(H) gene, positive ZAP-70 expression, or a combination thereof.
4. The method of claim 1, wherein the miR-029a microRNA comprises SEQ ID NO:67 or SEQ ID NO:68.
5. The method of claim 1, wherein the miR-029a microRNA comprises nucleotides 41-62 of SEQ ID NO:68.
6. The method of claim 1, wherein the miR-029b-2 microRNA comprises SEQ ID NO:251.
7. The method of claim 1, wherein the miR-029b-2 microRNA comprises nucleotides 51-70 of SEQ ID NO:251.
8. The method of claim 1, wherein the miR-029c microRNA comprises SEQ ID NO:69.
9. The method of claim 1, wherein the miR-029c microRNA comprises nucleotides 64-85 of SEQ ID NO:69.
10. A method of determining increased risk of a human subject developing B-cell chronic lymphocytic leukemia (CLL) or increased likelihood of the presence of CLL in a human subject comprising the steps of: (1) reverse transcribing one or more microRNAs selected from the group consisting of miR-029a microRNA, miR-029b-2 microRNA and miR-029c microRNA from a blood sample from the subject to provide at least one target oligodeoxynucleotide for each of the one or more microRNAs; (2) hybridizing the at least one target oligodeoxynucleotide for each of the one or more microRNAs to a microarray comprising miRNA-specific probe oligonucleotides that include at least one microRNA-specific probe oligonucleotide for each of the one or more microRNAs to provide a hybridization profile for the sample, wherein the hybridization profile includes a hybridization signal for each of the one or more microRNAs; and (3) comparing the sample hybridization profile to a control hybridization profile, wherein the control hybridization profile includes a control hybridization signal for each of the one or more microRNAs, wherein a hybridization signal of the one or more microRNAs in the sample hybridization profile that is greater than the control hybridization signal for the microRNA in the control hybridization profile indicates the subject has increased risk of developing CLL or increased likelihood of the presence of CLL.
11. The method of claim 10, wherein the microarray comprises miRNA-specific probe oligonucleotides for a substantial portion of the human miRNome.
12. The method of claim 10, wherein the subject has an unmutated IgV_(H) gene, positive ZAP-70 expression, or a combination thereof.
13. A method of determining increased risk of a human subject developing B-cell chronic lymphocytic leukemia (CLL) associated with one or more adverse prognostic markers or increased likelihood of the presence of CLL associated with one or more adverse prognostic markers in a human subject comprising the steps of: (1) reverse transcribing one or more microRNAs selected from the group consisting of miR-029a microRNA, miR-029b-2 microRNA and miR-029c microRNA from a blood sample from the subject to provide at least one target oligodeoxynucleotide for each of the one or more microRNAs; (2) hybridizing the at least one target oligodeoxynucleotide for each of the one or more microRNAs to a microarray comprising miRNA-specific probe oligonucleotides that include at least one microRNA-specific probe oligonucleotide for each of the one or more microRNAs to provide a hybridization profile for the sample, wherein the hybridization profile includes a hybridization signal for each of the one or more microRNAs; and (3) comparing the sample hybridization profile to a control hybridization profile, wherein the control hybridization profile includes a control hybridization signal for each of the one or more microRNAs, wherein a hybridization signal of the one or more microRNAs in the sample hybridization profile that is greater than the control hybridization signal for the microRNA in the control hybridization profile indicates the subject has increased risk of developing CLL associated with one or more adverse prognostic markers or increased likelihood of the presence of CLL associated with one or more adverse prognostic markers.
14. The method of claim 13, wherein the one or more adverse prognostic markers are selected from the group consisting of: positive ZAP-70 expression; and an unmutated IgV_(H) gene; or a combination thereof.
15. The method of claim 13, wherein the subject has an unmutated IgV_(H) gene, positive ZAP-70 expression, or a combination thereof.